To my son 


Preface 


1. This book is above all addressed to mathematicians. It is intended to be 
a textbook of mathematical logic on a sophisticated level, presenting the 
reader with several of the most significant discoveries of the last ten or 
fifteen years. These include: the independence of the continuum hypothe- 
sis, the Diophantine nature of enumerable sets, the impossibility of finding 
an algorithmic solution for one or two old problems. 

All the necessary preliminary material, including predicate logic and the 
fundamentals of recursive function theory, is presented systematically and 
with complete proofs. We only assume that the reader is familiar with 
“naive” set theoretic arguments. 

In this book mathematical logic is presented both as a part of mathe- 
matics and as the result of its self-perception. Thus, the substance of the 
book consists of difficult proofs of subtle theorems, and the spirit of the 
book consists of attempts to explain what these theorems say about the 
mathematical way of thought. 

Foundational problems are for the most part passed over in silence. 
Most likely, logic is capable of justifying mathematics to no greater extent 
than biology is capable of justifying life. 


2. The first two chapters are devoted to predicate logic. The presenta- 
tion here is fairly standard, except that semantics occupies a very domi- 
nant position, truth is introduced before deducibility, and models of 
speech in formal languages precede the systematic study of syntax. 

The material in the last four sections of Chapter II is not completely 
traditional. In the first place, we use Smullyan’s method to prove Tarski’s 
theorem on the undefinability of truth in arithmetic, long before the 
introduction of recursive functions. Later, in the seventh chapter, one of 
the proofs of the incompleteness theorem is based on Tarski’s theorem. In 
the second place, a large section is devoted to the logic of quantum 
mechanics and to a proof of von Neumann’s theorem on the absence of 
“hidden variables” in the quantum mechanical picture of the world. 

The first two chapters together may be considered as a short course in 
logic apart from the rest of the book. Since the predicate logic has received 
the widest dissemination outside the realm of professional mathematics, 
the author has not resisted the temptation to pursue certain aspects of its 
relation to linguistics, psychology, and common sense. This is all discussed 
in a series of digressions, which, unfortunately, too often end up trying to 
explain “the exact meaning of a proverb” (E. Baratynskii¹). This series of 
digressions ends with the second chapter. 

The third and fourth chapters are optional. They are devoted to complete 
proofs of the theorems of Gödel and Cohen on the independence of 
the continuum hypothesis. Cohen forcing is presented in terms of 
Boolean-valued models; Gödel’s constructible sets are introduced as a 
subclass of von Neumann’s universe. The number of omitted formal 
deductions does not exceed the accepted norm; due respects are paid to 
syntactic difficulties. This ends the first part of the book: “Provability.” 

The reader may skip the third and fourth chapters, and proceed im- 
mediately to the fifth. Here we present elements of the theory of recursive 
functions and enumerable sets, formulate Church’s thesis, and discuss the 
notion of algorithmic undecidability. 

The basic content of the sixth chapter is a recent result on the 
Diophantine nature of enumerable sets. We then use this result to prove the 
existence of versal families, the existence of undecidable enumerable sets, 
and, in the seventh chapter, Gödel’s incompleteness theorem (as based on 
the definability of provability via an arithmetic formula). Although it is 
possible to disagree with this method of development, it has several 
advantages over earlier treatments. In this version the main technical effort 
is concentrated on proving the basic fact that all enumerable sets are 
Diophantine, and not on the more specialized and weaker results 
concerning the set of recursive descriptions or the Gödel numbers of proofs. 


¹ Nineteenth century Russian poet (translator’s note). The full poem is: 


We diligently observe the world, 

We diligently observe people, 

And we hope to understand their deepest meaning. 
But what is the fruit of long years of study? 

What do the sharp eyes finally detect? 

What does the haughty mind finally learn 

At the height of all experience and thought, 
What?—the exact meaning of an old proverb. 


1828 
The last section of the sixth chapter stands somewhat apart from the 
rest. It contains an introduction to the Kolmogorov theory of complexity, 
which is of considerable general mathematical interest. 

The fifth and sixth chapters are independent of the earlier chapters, and 
together make up a short course in recursive function theory. They form 
the second part of the book: “Computability.” 

The third part of the book, “Provability and Computability,” relies 
heavily on the first and second parts. It also consists of two chapters. All of 
the seventh chapter is devoted to Gödel’s incompleteness theorem. The 
theorem appears later in the text than is customary because of the belief 
that this central result can only be understood in its true light after a solid 
grounding both in formal mathematics and in the theory of computability. 
Hurried expositions, where the proof that provability is definable is en- 
tirely omitted and the mathematical content of the theorem is reduced to 
some version of the “liar paradox,” can only create a distorted impression 
of this remarkable discovery. The proof is considered from several points 
of view. We pay special attention to properties which do not depend on the 
choice of Gödel numbering. Separate sections are devoted to Feferman’s 
recent theorem on Gödel formulas as axioms, and to the old but very 
beautiful result of Gödel on the length of proofs. 

The eighth and final chapter is, in a way, removed from the theme of 
the book. In it we prove Higman’s theorem on groups defined by enumer- 
able sets of generators and relations. The study of recursive structures, 
especially in group theory, has attracted continual attention in recent 
years, and it seems worthwhile to give an example of a result which is 
remarkable for its beauty and completeness. 


3. This book was written for very personal reasons. After several years 
or decades of working in mathematics, there almost inevitably arises the 
need to stand back and look at this research from the side. The study of 
logic is, to a certain extent, capable of fulfilling this need. 

Formal mathematics has more than a slight touch of self-caricature. Its 
structure parodies the most characteristic, if not the most important, 
features of our science. The professional topologist or analyst experiences a 
strange feeling when he recognizes the familiar pattern glaring out at him 
in stark relief. 

This book uses material arrived at through the efforts of many mathe- 
maticians. Several of the results and methods have not appeared in 
monograph form; their sources are given in the text. The author’s point of 
view has formed under the influence of the ideas of Hilbert, Gödel, Cohen, 
and especially John von Neumann, with his deep interest in the external 
world, his open-mindedness and spontaneity of thought. 

Various parts of the manuscript have been discussed with Yu. V. 
Matijasevič, G. V. Čudnovskii, and S. G. Gindikin. I am deeply grateful to 
all of these colleagues for their criticism. 
W. D. Goldfarb of Harvard University very kindly agreed to proofread 
the entire manuscript. For his detailed corrections and laborious rewriting 
of part of Chapter IV, I owe a special debt of gratitude. 

I wish to thank Neal Koblitz for his meticulous translation. 


Yu. I. Manin 
Moscow, September 1974 
CHAPTER I 


Introduction to formal languages 


Gelegentlich ergreifen wir die Feder 

Und schreiben Zeichen auf ein weisses Blatt, 
Die sagen dies und das, es kennt sie jeder, 
Es ist ein Spiel, das seine Regeln hat. 


H. Hesse, “Buchstaben” 


We now and then take pen in hand 
And make some marks on empty paper. 
Just what they say, all understand. 

It is a game with rules that matter. 


H. Hesse, “Alphabet” 
(translated by Prof. Richard S. Ellis) 


1 General information 


1.1. Let A be any abstract set. We call A an alphabet. Finite sequences of 
elements of A are called expressions in A. Finite sequences of expressions 
are called texts. 

We shall speak of a language with alphabet A if certain expressions and 
texts are distinguished (as being “correctly composed,” “meaningful,” etc.). 
Thus, in the Latin alphabet A we may distinguish English word forms and 
grammatically correct English sentences. The resulting set of expressions 
and texts is a working approximation to the intuitive notion of the 
“English language.” 

The language Algol 60 consists of distinguished expressions and texts in 
the alphabet {Latin letters} ∪ {digits} ∪ {logical signs} ∪ {separators}. 
Programs are among the most important distinguished texts. 
In natural languages the set of distinguished expressions and texts 
usually has unsteady boundaries. The more formal the language, the more 
rigid these boundaries are. 

The rules for forming distinguished expressions and texts make up the 
syntax of the language. The rules which tell how they correspond with 
reality make up the semantics of the language. Syntax and semantics are 
described in a metalanguage. 


1.2. “Reality” for the languages of mathematics consists of certain classes 
of (mathematical) arguments or certain computational processes using 
(abstract) automata. Corresponding to these designations, the languages 
are divided into formal and algorithmic languages. (Compare: in natural 
languages, the declarative versus imperative moods, or—on the level of 
texts—statement versus command.) 

Different formal languages differ from one another, in the first place, by 
the scope of the formalizable types of arguments—their expressiveness; in 
the second place, by their orientation toward concrete mathematical theo- 
ries; and in the third place, by their choice of elementary modes of 
expression (from which all others are then synthesized) and written forms 
for them. 

In the first part of this book a certain class of formal languages is 
examined systematically. Algorithmic languages are brought in episodi- 
cally. 

The “language—parole” dichotomy, which goes back to Humboldt and 
Saussure, is as relevant to formal languages as to natural languages. In §3 
of this chapter we give models of “speech” in two concrete languages, 
based on set theory and arithmetic, respectively; because, as many believe, 
habits of speech must precede the study of grammar. 

The language of set theory is among the richest in expressive means, 
despite its extreme economy. In principle, a formal text can be written in 
this language corresponding to almost any segment of modern mathema- 
tics—topology, functional analysis, algebra, or logic. 

The language of arithmetic is one of the poorest, but its expressive 
possibilities are sufficient for describing all of elementary arithmetic, and 
also for demonstrating the effects of self-reference à la Gödel and Tarski. 


1.3. As a means of communication, discovery, and codification, no formal 
language can compete with the mixture of mathematical argot and for- 
mulas which is common to every working mathematician. 

However, because they are so rigidly normalized, formal texts can 
themselves serve as an object for mathematical investigation. The results of 
this investigation are themselves theorems of mathematics. They arouse 
great interest (and strong emotions) because they can be interpreted as 
theorems about mathematics. But it is precisely the possibility of these and 
still broader interpretations that determines the general philosophical and 
human value of mathematical logic. 

1.4. We have agreed that the expressions and texts of a language are 
elements of certain abstract sets. In order to work with these elements, we 
must somehow fix them materially. In the modern European tradition (as 
opposed to the ancient Babylonian tradition, or the latest American 
tradition, using computer memory), the following notation is customary. 
The elements of the alphabet are indicated by certain symbols on paper 
(letters of different kinds of type, digits, additional signs, and also combi- 
nations of these). An expression in an alphabet A is written in the form of 
a sequence of symbols, read from left to right, with hyphens when 
necessary. A text is written as a sequence of written expressions, with 
spaces or punctuation marks between them. 


1.5. If written down, most of the interesting expressions and texts in a 
formal language either would be physically extremely long, or else would 
be psychologically difficult to decipher and learn in an acceptable amount 
of time, or both. 

They are therefore replaced by “abbreviated notation” (which can 
sometimes turn out to be physically longer). The expression “xxxxxx” can 
be briefly written “x ⋯ x (six times)” or “x⁶.” The expression “∀z(z ∈ x 
⇔ z ∈ y)” can be briefly written “x = y.” Abbreviated notation can also be 
a way of denoting any expression of a definite type, not only a single such 
expression (any expression 101010 ⋯ 10 can be briefly written “the 
sequence of length 2n with ones in odd places and zeros in even places” or 
“the binary expansion of ⅔(4ⁿ − 1)”). 
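
The last example can be checked mechanically. The following is a minimal Python sketch and not part of the original text; the names alternating_bits and closed_form are our own. It assumes that the expression in question has length 2n, and it verifies for small n that reading 1010⋯10 as a binary numeral gives ⅔(4ⁿ − 1).

```python
def alternating_bits(n: int) -> str:
    """Return the expression 1010...10 of length 2n (ones in odd places)."""
    return "10" * n


def closed_form(n: int) -> int:
    """Return (2/3) * (4**n - 1); the division is exact because 3 divides 4**n - 1."""
    return 2 * (4**n - 1) // 3


for n in range(1, 6):
    expression = alternating_bits(n)
    # Read the written expression as a binary numeral and compare.
    assert int(expression, 2) == closed_form(n)
    print(n, expression, closed_form(n))
```

The check illustrates the point of the paragraph: the two abbreviated descriptions name the same family of expressions, although neither of them is itself one of those expressions.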

Ever since our tradition started, with Vieta, Descartes, and Leibniz, 
abbreviated notation has served as an inexhaustible source of inspiration 
and errors. There is no sense in, or possibility of, trying to systematize its 
devices; they bear the indelible imprint of the fashion and spirit of the 
times, the artistry and pedantry of the authors. The symbols ∂, ∫, ∈ are 
classical models worthy of imitation. Frege’s notation, now forgotten, for 
“P and Q” (actually “not [if P, then not Q],” whence the asymmetry): 


[diagram: Frege’s two-dimensional notation, with Q on the upper line and P on the lower line]


shows what should be avoided. In any case, abbreviated notation per- 
meates mathematics. 
The reader should become used to the trinity 


formal text 


written text ─── interpretation of text, 


which replaces the unconscious identification of a statement with its form 
and its sense, as one of the first priorities in his study of logic. 
2 First order languages 


In this section we describe the most important class of formal languages, ℒ₁ 
(the first order languages), and give two concrete representatives of this 
class: the Zermelo–Fraenkel language of set theory L₁Set, and the Peano 
language of arithmetic L₁Ar. Another name for ℒ₁ is predicate languages. 


2.1. The alphabet of any language in the class ℒ₁ is divided into six disjoint 
subsets. The following table lists the generic name for the elements in each 
subset, the standard notation for these elements in the general case, and the 
special notation used in this book for the languages L₁Set and L₁Ar. We 
then describe the rules for forming distinguished expressions and briefly 
discuss semantics. 

The distinguished expressions of any language L in the class ℒ₁ are 
divided into two types: terms and formulas. Both types are defined recur- 
sively. 


2.2. Definition. Terms are the elements of the least subset of the 
expressions of the language which satisfies the two conditions: 


(a) Variables and constants are (atomic) terms. 
(b) If f is an operation of degree r and t₁, ..., tᵣ are terms, then 
f(t₁, ..., tᵣ) is a term. 


In (a) we identify an element with a sequence of length one. The 
alphabet does not include commas, which are part of our abbreviated 
notation: f(t₁, t₂, t₃) means the same as f(t₁t₂t₃). In §1 of Chapter II we 
explain how a sequence of terms can be uniquely deciphered despite the 
absence of commas. 


Language Alphabets 

Subsets of           Names and Notation
the Alphabet         General                  in L₁Set                 in L₁Ar
-------------------  -----------------------  -----------------------  ----------------------------
connectives and      ⇔ (equivalent); ⇒ (implies); ∨ (inclusive or); ∧ (and);
quantifiers          ¬ (not); ∀ (universal quantifier); ∃ (existential quantifier)

variables            x, y, z, u, v, ... with indices

constants            ... with indices         ∅ (empty set)            0 (zero); 1 (one)

operations of        f, g, ... with indices   none                     + (addition, degree 2);
degree 1, 2, 3, ...                                                    · (multiplication, degree 2)

relations (pred-     p, q, ... with indices   ∈ (is an element of,     = (equality, degree 2)
icates) of degree                             degree 2);
1, 2, 3, ...                                  = (equals, degree 2)

parentheses          ( (left parenthesis); ) (right parenthesis)

If two sets of expressions in the language satisfy conditions (a) and (b), 
then the intersection of the two sets also satisfies these conditions. There- 
fore the definition of the set of terms is correct. 
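
The recursive character of this definition can be made concrete. The following Python sketch is only an illustration under our own assumptions: a term is represented as either a string (an atomic term) or a tuple whose first entry is an operation symbol, and the tables DEGREE and ATOMS describe a small hypothetical signature modelled on L₁Ar. The function is_term follows clauses (a) and (b), and normalized writes a term without commas.

```python
# Hypothetical signature for illustration: operation symbols with their degrees.
DEGREE = {"+": 2, "·": 2}
# A few variables and constants serving as atomic terms.
ATOMS = {"0", "1", "x", "y", "z"}


def is_term(t) -> bool:
    """Decide membership in the least set closed under clauses (a) and (b)."""
    if isinstance(t, str):                      # clause (a): atomic terms
        return t in ATOMS
    if isinstance(t, tuple) and t and isinstance(t[0], str) and t[0] in DEGREE:
        op, args = t[0], t[1:]                  # clause (b): f(t1, ..., tr)
        return len(args) == DEGREE[op] and all(is_term(a) for a in args)
    return False


def normalized(t) -> str:
    """Write a term in normalized notation, i.e. without commas."""
    if isinstance(t, str):
        return t
    return t[0] + "(" + "".join(normalized(a) for a in t[1:]) + ")"


three = ("+", "1", ("+", "1", "1"))
print(is_term(three), normalized(three))
```

Because is_term only accepts what clauses (a) and (b) force it to accept, it computes exactly the least subset described in the definition, restricted to this small signature.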


2.3. Definition. Formulas are the elements of the least subset of the 
expressions of the language which satisfies the two conditions: 


(a) If p is a relation of degree r and t₁, ..., tᵣ are terms, then 
p(t₁, ..., tᵣ) is an (atomic) formula. 

(b) If P and Q are formulas (abbreviated notation!), and x is a variable, 
then the expressions 


(P)⇔(Q), (P)⇒(Q), (P)∨(Q), (P)∧(Q), 
¬(P), ∀x(P), ∃x(P) 
are formulas. 


It is clear from the definitions that any term is obtained from atomic 
terms in a finite number of steps, each of which consists in “applying an 
operation symbol” to the earlier terms. The same is true for formulas. In 
Chapter II, §1 we make this remark more precise. 

The following initial interpretations of terms and formulas are given for 
the purpose of orientation and belong to the so-called “standard models” 
(see Chapter II, §2 for the precise definitions). 


2.4. EXAMPLES AND INTERPRETATIONS 
(a) The terms stand for (are notation for) the objects of the theory. 
Atomic terms stand for indeterminate objects (variables) or concrete 
objects (constants). The term f(t₁, ..., tᵣ) is the notation for the object 
obtained by applying the operation denoted by f to the objects denoted by 
t₁, ..., tᵣ. Here are some examples from L₁Ar: 
0 denotes zero; 
1 denotes one; 
+(1, 1) denotes two (1 + 1 = 2 in the usual notation); 
+(1, +(1, 1)) denotes three; 
·(+(1, 1), +(1, 1)) denotes four (2 × 2 = 4). 


Since this normalized notation is different from what we are used to in 
arithmetic, in L₁Ar we shall usually write simply t₁ + t₂ instead of +(t₁, t₂) 
and t₁·t₂ instead of ·(t₁, t₂). This convention may be considered as another 
use of abbreviated notation. 

x stands for an indeterminate integer; 


x + 1 (or +(x, 1)) stands for the next integer. 
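
The way a term of L₁Ar denotes a number can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the formal definition of Chapter II: terms are nested tuples (operation symbol, left term, right term), atomic terms are strings, and the dictionary passed as assignment gives values to the variables.

```python
CONSTANTS = {"0": 0, "1": 1}


def evaluate(term, assignment=None):
    """Return the number denoted by a term under the standard interpretation."""
    assignment = assignment or {}
    if isinstance(term, str):
        if term in CONSTANTS:            # a constant denotes a concrete object
            return CONSTANTS[term]
        return assignment[term]          # a variable denotes an indeterminate one
    op, left, right = term
    a = evaluate(left, assignment)
    b = evaluate(right, assignment)
    if op == "+":
        return a + b
    if op == "·":
        return a * b
    raise ValueError(f"unknown operation symbol: {op!r}")


two = ("+", "1", "1")
three = ("+", "1", two)
four = ("·", two, two)
print(evaluate(two), evaluate(three), evaluate(four))   # 2 3 4
print(evaluate(("+", "x", "1"), {"x": 6}))              # 7
```

A term containing variables has no value until an assignment is supplied, which matches the description of x as standing for an indeterminate integer.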
In the language L₁Set all terms are atomic: 


x stands for an indeterminate set; 
∅ stands for the empty set. 


(b) The formulas stand for statements (arguments, propositions, .. . ) of 
the theory. When translated into formal language, a statement may be 
either true, false, or indeterminate (if it concerns indeterminate objects): 
see Chapter II for the precise definitions. In the general case the atomic 
formula p(t₁, ..., tᵣ) has roughly the following meaning: “The ordered 
r-tuple of objects denoted by t₁, ..., tᵣ has the property denoted by p.” 
Here are some examples of atomic formulas in L₁Ar. Their general 
structure is =(t₁, t₂), or, in nonnormalized notation, t₁ = t₂: 


0 = 1, x + 1 = y. 


Here are some examples of formulas which are not atomic: 


Some atomic formulas in L₁Set: 
y ∈ x (y is an element of x), 


and also ∅ ∈ y, x ∈ ∅, etc. Of course, normalized notation must have the 
form ∈(xy), and so on. 
Some nonatomic formulas: 


∃x(∀y(¬(y ∈ x))): there exists an x of which no y is an element. 


Informally this means: “The empty set exists.” We once again recall that 
an informal interpretation presupposes some standard interpretive system, 
which will be introduced explicitly in Chapter II. 


∀y(y ∈ z ⇒ y ∈ x): z is a subset of x. 


This is an example of a very useful type of abbreviated notation: four 
parentheses are omitted in the formula on the left. We shall not specify 
precisely when parentheses may be omitted; in any case, it must be 
possible to reinsert them in a way that is unique or is clear from the 
context without any special effort. 

We again emphasize: abbreviated notations for formulas are only 
material designations. Abbreviated notation is chosen for the most part 
with psychological goals in mind: speed of reading (possibly with a loss in 
formal uniqueness), tendency to encourage useful associations and 
discourage harmful ones, suitability to the habits of the author and reader, 
and so on. The mathematical objects in the theory of formal languages are 
the formulas themselves, and not any particular designations. 


Digression: names 


On several occasions we have said that a certain object (a sign on paper, 
an element of an alphabet as an abstract set, etc.) is a notation for, or 
denotes, another element. A convenient general term for this relationship is 
naming. 

The letter x is the name of an element of the alphabet; when it appears 
in a formula, it becomes the name of a set or a number; the notation x € y 
is the name of an expression in the alphabet A, and this expression, in turn, 
is the name of an assertion about indeterminate sets; and so on. 

When we form words, we often identify the names of objects with the 
objects themselves: we say “the variable x,” “the formula P,” “the set z.” 
This can sometimes be dangerous. The following passage from Rosser’s 
book Logic for Mathematicians points up certain hidden pitfalls: 


2 6 


The gist of the matter is that, if we have a statement such as “3 is 
greater than 3” about the rational number 3 and containing a name “3” 
of this rational number, one can replace this name by any other name of 


the same rational number, for instance, “?.” If we have a statement such 
as “3 divides the denominator of ‘;;’” about a name of a rational number 
and containing a name of this name, one can replace this name of the 
name by some other name of the same name, but not in general by the 
name of some other name, if it is a name of some other name of the same 
rational number. 


Rosser adds that “failure to observe such distinctions carefully can seldom 
lead to confusion in logic and still more seldom in mathematics.” How- 
ever, these distinctions play a significant role in philosophy and in 
mathematical practice. 

“A rose by any other name would smell as sweet”—this is true because 
roses exist outside of us and smell in and of themselves. But, for example, 
it seems that Hilbert spaces only “exist” insofar as we talk about them, and 
the choice of terminology here makes a difference. The word “space” for 
the set of equivalence classes of square integrable functions was at the 
same time a codeword for an entire circle of intuitive ideas concerning 
“real” spaces. This word helped organize the concept and led it in the right 
direction. 

A successfully chosen name is a bridge between scientific knowledge 
and common sense, between new experience and old habits. The concep- 
tual foundation of any science consists of a complicated network of names 
of things, names of ideas, and names of names. It evolves itself, and its 
projection on reality changes. 
3 Beginners’ course in translation 


3.1. We recall that the formulas in L₁Set stand for statements about sets; 
the formulas in L₁Ar stand for statements about natural numbers; these 
formulas contain names of sets and numbers, which may be indeterminate. 

In this section we give the first basic examples of two-way translation 
“argot ↔ formal language.” One of our purposes will be to indicate the 
great expressive possibilities in L₁Set and L₁Ar, despite the extremely 
limited modes of expression. 

As in the case of natural languages, this translation cannot be given by 
rigid rules, is not uniquely determined, and is a creative process. Compare 
Hesse’s quatrain with its translation in the epigraph to this book: the most 
important aim of translation is to “understand . . . just what they say.” 

Before reading further, the reader should look through the Appendix to 
Chapter II: “The von Neumann Universe.” The semantics implicit in L₁Set 
relates to this universe, and not to arbitrary “Cantor” sets. 

A more complete picture of the meaning of the formulas can be 
obtained from §2 of Chapter II. 


Translation from L₁Set to argot. 


3.2. ∀x(¬(x ∈ ∅)): “for all (sets) x it is false that x is an element of (the 
set) ∅” (or “∅ is the empty set”). 

The second assertion is only equivalent to the first in the von Neumann 
universe, where the elements of sets can only be sets, and not real 
numbers, chairs, or atoms. 


3.3. ∀z(z ∈ x ⇔ z ∈ y) ⇔ x = y: “if for all z it is true that z is an element of 
x if and only if z is an element of y, then it is true that x coincides with y; 
and conversely,” or “a set is uniquely determined by its elements.” 

In the expression 3.3 at least six parentheses have been omitted; and the 
subformulas z ∈ x, z ∈ y, x = y have not been normalized according to the 
rules of ℒ₁. 


3.4. ∀u ∀v ∃x ∀z(z ∈ x ⇔ (z = u ∨ z = v)): “for any two sets u, v there 
exists a third set x such that u and v are its only elements.” 

This is one of the axioms of Zermelo–Fraenkel. The set x is called the 
“unordered pair of sets u, v” and is denoted {u, v} in the Appendix. 


3.5. ∀y ∀z(((z ∈ y ∧ y ∈ x) ⇒ z ∈ x) ∧ (y ∈ x ⇒ ¬(y ∈ y))): “the set x 
is partially ordered by the relation ∈ between its elements.” 

We mechanically copied the condition y ∈ x ⇒ ¬(y ∈ y) from the 
definition of partial ordering. This condition is automatically fulfilled in 
the von Neumann universe, where no set is an element of itself. 

A useful exercise would be to write the following formulas: 


“x is totally ordered by the relation ∈”; 
“x is linearly ordered by the relation ∈”; 
“x is an ordinal.” 


3.6. ∀x(y ∈ z): The literal translation “for all x it is true that y is an 
element of z” sounds a little strange. The formula ∀x ∃x(y ∈ z), which 
agrees with the rules for constructing formulas, looks even worse. It would 
be possible to make the rules somewhat more complicated, in order to rule 
out such formulas, but in general they cause no harm. In Chapter II we 
shall see that, from the point of view of “truth” or “deducibility,” such a 
formula is equivalent to the formula y ∈ z. It is in this way that they must 
be understood. 


Translation from argot to L₁Set. 


We choose several basic constructions having general mathematical signifi- 
cance and show how they are realized in the von Neumann universe, which 
only contains sets obtained from ∅ by the process of “collecting into a 
set,” and in which all relations must be constructed from ∈. 


3.7. “x is the direct product y × z.” 

This means that the elements of x are the ordered pairs of elements of y 
and z, respectively. The definition of an unordered pair is obvious: the 
formula 


∀u(u ∈ x ⇔ (u = y₁ ∨ u = z₁)) 


“means,” or may be briefly written in the form, x = {y₁, z₁} (compare 3.4). 
The ordered pair of y₁ and z₁ is introduced using a device of Kuratowski and 
Wiener: this is the set x₁ whose elements are the unordered pairs {y₁, y₁} 
and {y₁, z₁}. We thus arrive at the formula 


∃y₂ ∃z₂(“x₁ = {y₂, z₂}” ∧ “y₂ = {y₁, y₁}” ∧ “z₂ = {y₁, z₁}”), 


which will be abbreviated 


x₁ = ⟨y₁, z₁⟩ 


and will be read: “x₁ is the ordered pair with first element y₁ and second 
element z₁.” The abbreviated notation for the subformulas is in quotes; we 
shall later omit the quotation marks. 

Finally, the statement “x = y × z” may be written in the form: 

∀x₁(x₁ ∈ x ⇔ ∃y₁ ∃z₁(y₁ ∈ y ∧ z₁ ∈ z ∧ “x₁ = ⟨y₁, z₁⟩”)). 
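
The Kuratowski-Wiener device can be tried out on small hereditarily finite sets. The sketch below is an illustration only; it models sets by Python frozensets, and the names unordered_pair, ordered_pair and direct_product are our own. It shows that the ordered pair, unlike the unordered one, distinguishes its first element from its second, and that the direct product of two sets with two elements each has four elements.

```python
def unordered_pair(a, b):
    """The set {a, b}; it has one element when a == b."""
    return frozenset({a, b})


def ordered_pair(a, b):
    """Kuratowski-Wiener pair: the set whose elements are {a, a} and {a, b}."""
    return frozenset({unordered_pair(a, a), unordered_pair(a, b)})


def direct_product(y, z):
    """The set of all ordered pairs with first element in y and second in z."""
    return frozenset(ordered_pair(a, b) for a in y for b in z)


EMPTY = frozenset()            # the empty set
ONE = frozenset({EMPTY})       # the set whose only element is the empty set

print(unordered_pair(EMPTY, ONE) == unordered_pair(ONE, EMPTY))   # True
print(ordered_pair(EMPTY, ONE) == ordered_pair(ONE, EMPTY))       # False
print(len(direct_product({EMPTY, ONE}, {EMPTY, ONE})))            # 4
```

Everything here is built from the empty set by repeatedly collecting into a set, in keeping with the description of the von Neumann universe given above.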


In order to remind the reader for the last time of the liberties taken in 
abbreviated notation, we write this same formula adhering to all the 
canons of ℒ₁: 


∀x₁((∈(x₁x))⇔(∃y₁(∃z₁(((∈(y₁y))∧(∈(z₁z)))∧(∃y₂(∃z₂(((∀u((∈(ux₁)) 
⇔((=(uy₂))∨(=(uz₂)))))∧(∀u((∈(uy₂))⇔(=(uy₁))))) 
∧(∀u((∈(uz₂))⇔((=(uy₁))∨(=(uz₁))))))))))))) 


EXERCISE: Find the open parenthesis corresponding to the fifth closed parenthesis 
from the end. In §1 of Chapter II we give an algorithm for solving such problems. 
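
Problems of this kind can be solved mechanically. The sketch below is not necessarily the algorithm of Chapter II; it is the standard method of scanning the expression once while keeping a stack of unmatched open parentheses, written in Python under our own naming (match_parentheses and partner_of_kth_closing_from_end are hypothetical helper names).

```python
def match_parentheses(expr: str) -> dict:
    """Map the position of each ')' to the position of its matching '('."""
    stack, partner = [], {}
    for i, ch in enumerate(expr):
        if ch == "(":
            stack.append(i)                 # remember an unmatched '('
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            partner[i] = stack.pop()        # the most recent '(' matches
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return partner


def partner_of_kth_closing_from_end(expr: str, k: int) -> int:
    """Position of the '(' matching the k-th ')' counted from the end."""
    closing = [i for i, ch in enumerate(expr) if ch == ")"]
    return match_parentheses(expr)[closing[-k]]


example = "((a)(b))"
print(partner_of_kth_closing_from_end(example, 1))   # 0
print(partner_of_kth_closing_from_end(example, 2))   # 4
```

Positions are counted from zero. Applied to a formula written according to all the rules, the same function answers the exercise for any choice of closed parenthesis.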


3.8. “f is a mapping from the set u to the set v.” 

First of all, mappings, or functions, are identified with their graphs; 
otherwise, we would not be able to consider them as elements of the 
universe. The following formula successively imposes three conditions on 
f: f is a subset of u × v; the projection of f onto u coincides with all of u; 
and each element of u corresponds to exactly one element of v: 


∀z(z ∈ f ⇒ (∃u₁ ∃v₁(u₁ ∈ u ∧ v₁ ∈ v ∧ “z = ⟨u₁, v₁⟩”))) 
∧ ∀u₁(u₁ ∈ u ⇒ ∃v₁ ∃z(v₁ ∈ v ∧ “z = ⟨u₁, v₁⟩” ∧ z ∈ f)) 
∧ ∀u₁ ∀v₁ ∀v₂(∃z₁ ∃z₂(z₁ ∈ f ∧ z₂ ∈ f ∧ “z₁ = ⟨u₁, v₁⟩” ∧ “z₂ = ⟨u₁, v₂⟩”) 
⇒ v₁ = v₂). 


EXERCISE: Write the formula “f is the projection of y × z onto z.” 


3.9. “x is a finite set.” 

Finiteness is far from being a primitive concept. Here is Dedekind’s 
definition: “there does not exist a one-to-one mapping f of the set x onto a 
proper subset.” The formula: 


¬∃f(“f is a mapping from x to x” ∧ ∀u₁ ∀u₂ ∀v₁ ∀v₂((“⟨u₁, v₁⟩ ∈ f” 
∧ “⟨u₂, v₂⟩ ∈ f” ∧ ¬(u₁ = u₂)) ⇒ ¬(v₁ = v₂)) ∧ ∃v₁(v₁ ∈ x ∧ ¬∃u₁ 
(“⟨u₁, v₁⟩ ∈ f”))). 

The abbreviation “⟨u₁, v₁⟩ ∈ f” means, of course, ∃y(“y = ⟨u₁, v₁⟩” ∧ y ∈ f). 


3.10. “x is a nonnegative integer.” 
The natural numbers are represented in the von Neumann universe by 
the finite ordinals, so that the required formula has the form: 


“x is totally ordered by the relation ∈” ∧ “x is finite.” 


EXERCISE: Figure out how to write the formulas “x + y = z” and “x·y = z,” where 
x, y, z are integers ≥ 0. 


After this it is possible in the usual way to write the formulas “x is an 
integer,” “x is a rational number,” “x is a real number” (following Cantor 
or Dedekind), etc., and then construct a formal version of analysis. The 
written statements will have acceptable length only if we periodically 
extend the language L₁Set (see §8 of Chapter II). For example, in L₁Set we 
are not allowed to write term-names for the numbers 1, 2, 3, ... (∅ is the 
name for 0), although we may construct the formulas “x is the finite 
ordinal containing 1 element,” “x is the finite ordinal containing 2 ele- 
ments,” etc. If we use such roundabout methods of expression, the simplest 
numerical identities become incredibly long; but, of course, in logic we are 
mainly concerned with the theoretical possibility of writing them. 
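
The finite ordinals themselves are easy to build. The sketch below uses the standard von Neumann construction, in which 0 is the empty set and the successor of n is n ∪ {n}; this construction is assumed here rather than quoted from the Appendix, and the names successor and ordinal are our own. Sets are modelled by Python frozensets.

```python
def successor(n: frozenset) -> frozenset:
    """The next ordinal after n, namely n together with n itself as an element."""
    return n | frozenset({n})


def ordinal(k: int) -> frozenset:
    """The finite ordinal containing exactly k elements."""
    n = frozenset()                 # the empty set is the name for 0
    for _ in range(k):
        n = successor(n)
    return n


three = ordinal(3)
print(len(three))                   # 3
print(ordinal(2) in three)          # True
```

The printed representation of even a small ordinal is already long, which gives some feeling for why the text speaks of roundabout methods of expression.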
3.11. “x is a topological space.” 

In the formula we must give the topology of x explicitly. We define the 
topology, for example, in terms of the set y of all open subsets of x. We 
first write that y consists of subsets of x and contains x and the empty set: 


P₁: ∀z(z ∈ y ⇒ ∀u(u ∈ z ⇒ u ∈ x)) ∧ x ∈ y ∧ ∅ ∈ y. 

The intersection w of any two elements u, v in y is open, i.e., belongs to y: 

P₂: ∀u ∀v ∀w((u ∈ y ∧ v ∈ y ∧ ∀z((z ∈ u ∧ z ∈ v) ⇔ z ∈ w)) ⇒ w ∈ y). 


It is harder to write “the union of any set of open subsets is open.” We 
first write: 


P₃: ∀u(u ∈ z ⇔ ∀v(v ∈ u ⇒ v ∈ y)), 

that is, “z is the set of all subsets of y.” Then: 

P₄: ∀u ∀w((u ∈ z ∧ ∀v₁(v₁ ∈ w ⇔ ∃v(v ∈ u ∧ v₁ ∈ v))) ⇒ w ∈ y). 


This means (taking into account P₃, which defines z): “If u is any subset of 
y, i.e., a set of open subsets of x, then the union w of all these subsets 
belongs to y, i.e., is open.” Now the final formula may be written as 
follows: 


P₁ ∧ P₂ ∧ ∀z(P₃ ⇒ P₄). 

The following comments on this formula will be reflected in precise
definitions in Chapter II, §§1 and 2. The letters x, y have the same
meaning in all the $P_i$, while z plays different roles: in $P_1$ it is a subset of x,
and in $P_3$ and $P_4$ it is the set of subsets of y. We are allowed to do this
because, as soon as we “bind” z by the quantifier $\forall$, say in $P_1$, z no longer
stands for an (indeterminate) individual set, and becomes a temporary
designation for “any set.” Where the “scope of action” of $\forall$ has ended, z can
be given a new meaning. In order to “free” z for later use, $\forall z$ was also put
before $P_3 \Rightarrow P_4$.
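
When x is finite, the conditions $P_1$ through $P_4$ can be checked mechanically. The following is only a sketch under stated assumptions: sets are modelled as Python frozensets, and the names `subsets` and `is_topology` are hypothetical helpers that do not come from the text.

```python
from itertools import chain, combinations


def subsets(s):
    """All subsets of the finite collection s, as frozensets."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


def is_topology(x, y):
    """Check P1, P2 and 'for all z (P3 => P4)' for a finite set x and family y."""
    x = frozenset(x)
    y = {frozenset(u) for u in y}
    # P1: every z in y is a subset of x, and x and the empty set belong to y.
    p1 = all(z <= x for z in y) and x in y and frozenset() in y
    # P2: the intersection of any two elements u, v of y belongs to y.
    p2 = all((u & v) in y for u in y for v in y)
    # P3 fixes z as the set of all subsets of y; P4 says that the union w
    # of every such subset u of y again belongs to y.
    p4 = all(frozenset().union(*u) in y for u in subsets(y))
    return p1 and p2 and p4
```

For example, `is_topology({1, 2}, [set(), {1}, {1, 2}])` holds, while the family `[set(), {1}, {2}, {1, 2, 3}]` on `{1, 2, 3}` fails $P_4$ because the union `{1, 2}` is missing. The enumeration of all subsets of y is exponential, so this is an illustration for small examples only.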


Translation from argot to $L_1\mathrm{Ar}$.


3.12. “$x < y$”: $\exists z(y = (x + z) + 1)$. Recall that the variables are names
for nonnegative integers.


3.13. “x is a divisor of y”: $\exists z(y = x \cdot z)$.


3.14. “x is a prime number”: “$1 < x$” $\wedge\ \forall y$(“y is a divisor of x” $\Rightarrow (y = 1 \vee y = x)$).
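
The three translations 3.12 to 3.14 can be evaluated directly over the nonnegative integers. The sketch below is an illustration, not part of the text: the function names are hypothetical, and each unbounded quantifier is replaced by a bounded search, which is justified in the comments.

```python
def less(x, y):
    # 3.12: exists z (y = (x + z) + 1); any witness z is smaller than y.
    return any(y == (x + z) + 1 for z in range(y + 1))


def divides(x, y):
    # 3.13: exists z (y = x * z); if a witness exists, one exists with z <= y.
    return any(y == x * z for z in range(y + 1))


def is_prime(x):
    # 3.14: 1 < x and for all y (y divides x => (y = 1 or y = x)).
    # When 1 < x, every divisor y of x satisfies y <= x, so the implication
    # is vacuously true for y > x and the search may stop at x.
    return less(1, x) and all(
        (not divides(y, x)) or y == 1 or y == x for y in range(x + 1))
```

Calling `is_prime` on 0, 1, 2, ... in turn picks out the primes in the usual sense; the search is deliberately naive and mirrors the shape of the formulas rather than aiming at efficiency.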


3.15. “Fermat's big theorem”: $\forall x_1\, \forall x_2\, \forall x_3\, \forall u$(“$2 < u$” $\wedge$ “$x_1^u + x_2^u = x_3^u$” $\Rightarrow$ “$x_1 x_2 x_3 = 0$”).
It is not clear how to write the formula $x_1^u + x_2^u = x_3^u$
in $L_1\mathrm{Ar}$. Of course, for any concrete u = 1, 2, 3 there is a corresponding
atomic formula in $L_1\mathrm{Ar}$, but how do we make u into a variable? This is not
a trivial problem. In the second part of the book we show how to find an
atomic formula $p(x, u, y, z_1, \ldots, z_n)$ such that the assertion that
$\exists z_1 \cdots \exists z_n\, p(x, u, y, z_1, \ldots, z_n)$ in the domain of natural numbers is
equivalent to $y = x^u$. Then $x_1^u + x_2^u = x_3^u$ can be translated as follows:


$\exists y_1\, \exists y_2\, \exists y_3$(“$x_1^u = y_1$” $\wedge$ “$x_2^u = y_2$” $\wedge$ “$x_3^u = y_3$” $\wedge\ y_1 + y_2 = y_3$).

The existence of such a p is a nontrivial number theoretic fact, so that here
the very possibility of performing a translation becomes a mathematical
problem.


3.16. “The Riemann hypothesis.” The Riemann zeta-function $\zeta(s)$ is defined
by the series $\sum_{n=1}^{\infty} n^{-s}$ in the half-plane $\operatorname{Re} s > 1$. It can be continued
meromorphically onto the entire complex s-plane. The Riemann hypothesis
is the assertion that the nontrivial zeros of $\zeta(s)$ lie on the line $\operatorname{Re} s = \tfrac{1}{2}$.
Of course, in this form the Riemann hypothesis cannot be translated into
$L_1\mathrm{Ar}$. However, there are several purely arithmetic assertions which are
demonstrably equivalent to the Riemann hypothesis. Perhaps the simplest
of them is the following.

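Let $\mu(n)$ be the Möbius function on the set of integers $\geq 1$: it equals 0 if
n is divisible by a square greater than 1, and equals $(-1)^k$, where k is the
number of prime divisors of n, if n is square-free. We then have:


Riemann hypothesis $\Leftrightarrow\ \forall \varepsilon > 0\ \exists x\ \forall y \left( y > x \Rightarrow \left| \sum_{n=1}^{y} \mu(n) \right| < y^{1/2 + \varepsilon} \right)$.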


Only the exponent on the right is not an integer; but $\varepsilon$ need only run
through numbers of the form $1/z$, z an integer $\geq 1$, and then we can raise
the inequality to the $(2z)$th power. The formula


$\left( \sum_{n=1}^{y} \mu(n) \right)^{2z} < y^{z+2}$


can then be translated into $L_1\mathrm{Ar}$, although not completely trivially. The
necessary techniques will be developed in the second part of the book.
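
The integer form of the inequality involves only sums, products, and comparisons of integers, which is what makes an arithmetic translation conceivable. The sketch below evaluates it for given y and z; the function names are hypothetical, and the sieve used to tabulate $\mu$ is a standard technique chosen here for illustration, not the method developed later in the book.

```python
def mobius_upto(n):
    """Return a list mu with mu[k] equal to the Moebius function of k, 1 <= k <= n."""
    mu = [1] * (n + 1)
    mu[0] = 0  # index 0 is unused
    is_prime = [True] * (n + 1)
    for p in range(2, n + 1):
        if is_prime[p]:
            for m in range(p, n + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more prime divisor
            for m in range(p * p, n + 1, p * p):
                mu[m] = 0               # divisible by the square p*p
    return mu


def inequality_holds(y, z, mu):
    """Integer form of the bound: (sum_{n <= y} mu(n)) ** (2z) < y ** (z + 2)."""
    partial_sum = sum(mu[1:y + 1])
    return partial_sum ** (2 * z) < y ** (z + 2)
```

Checking finitely many values of y says nothing about the Riemann hypothesis itself, since the statement quantifies over all sufficiently large y for every z; the code only shows that each individual instance is a finite computation with integers.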

The last two examples were given in order to show the complexity that
is possible in problems which can be stated in $L_1\mathrm{Ar}$, despite the apparent
simplicity of the modes of expression and the semantics of the language.
We conclude this section with some remarks concerning higher order
languages.


3.17. Higher order languages. Let L be any first order language. Its modes
of expression are limited in principle by one important consideration: we
are not allowed to speak of arbitrary properties of objects of the theory,
that is, arbitrary subsets of the set of all objects. Syntactically, this is
reflected in the prohibition against forming expressions such as $\forall p(p(x))$,
where p is a relation of degree 1; relations must stand for fixed rather than
variable properties.

Of course, certain properties can be defined using nonatomic formulas.
For example, in $L_1\mathrm{Ar}$ instead of “x is even” we may write
$\exists y(x = (1 + 1) \cdot y)$. However, there is a continuum of subsets of the
integers but only a countable set of definable properties (see §2 of Chapter
II), so there are automatically properties which cannot be defined by
formulas. Thus, it is impossible to replace the forbidden expression
$\forall p(p(x))$ by a sequence of expressions $P_1(x), P_2(x), P_3(x), \ldots$.

Languages in which quantifiers may be applied to properties and/or
functions (and also, possibly, to properties of properties, and so on) are
called higher order languages. One such language, $L_2\mathrm{Real}$, will be
considered in Chapter III for the purpose of illustrating a simplified version of
Cohen forcing.

On the other hand, the same extension of expressive possibilities can be
obtained without leaving $\mathcal{L}_1$. In fact, in the first order language $L_1\mathrm{Set}$ we
may quantify over all subsets of any set, over all subsets of the set of
subsets, and so on. Informally this means we are speaking of all properties,
all properties of properties, ... (with transfinite extension). In addition,
any higher order language with a “standard interpretation” in some type of
structured sets can be translated into $L_1\mathrm{Set}$ so as to preserve the meanings
and truth values in this standard interpretation. (An apparent exception is
the languages for describing Gödel-Bernays classes and “large” categories;
but it seems, based on our present understanding of paradoxes, that no
higher order languages can be constructed from such a language.)

The attentive reader will notice the contrast between the possibility of
writing a formula in $L_1\mathrm{Set}$ in which $\forall$ is applied to all subsets (informally,
to all properties) of finite ordinals (informally, of integers), and the
impossibility of writing a formula in $L_1\mathrm{Set}$ which would define any concrete
subset in the continuum of undefinable subsets. (There are fewer such
subsets in $L_1\mathrm{Set}$ than in $L_1\mathrm{Ar}$, but still a continuum.) We shall examine
these problems more closely in Chapter II when we discuss “Skolem's
paradox.”

Let us summarize. Almost all the basic logical and set theoretic principles
used in the day to day work of the mathematician are contained in the
first-order languages and, in particular, in $L_1\mathrm{Set}$. Hence, those languages
will be the subject of study in the first and third parts of the book. But
concrete oriented languages can be formed in other ways, with various
degrees of deviation from the rules of $\mathcal{L}_1$. In addition to $L_2\mathrm{Real}$, examples
of such languages examined in Chapter II include SELF (Smullyan's
language for self-description) and SAr, which is a language of arithmetic
convenient for proving Tarski's theorem on the undefinability of truth.


Digression: syntax 


1. The most important feature that most artificial languages have in 
common is the ability to encompass a rich spectrum of modes of expres- 
sion starting with a small finite number of generating principles. 

In each concrete case the choice of these principles (including the 
alphabet and syntax) is based on a compromise between two extremes. 
Economical use of modes of expression leads to unified notation and 
simplified mechanical analysis of the text. But then the texts become much 
longer and farther removed from natural language texts. Enriching the 
modes of expression brings the artificial texts closer to the natural lan- 
guage texts, but complicates the syntax and the formal analysis. (Compare 
machine languages with such programming languages as Algol, Fortran, 
Cobol, etc.) 

We now give several examples based on our material. 


2. Dialects of $\mathcal{L}_1$

(a) Without changing the logic in $\mathcal{L}_1$, it is possible to discard parentheses
and either of the two quantifiers from the alphabet, and to replace all the
connectives by one, namely $\downarrow$ (conjunction of negations). (In addition,
constants could be declared to be functions of degree 0, and functions
could be interpreted as relations.)

This is accomplished by the following change in the definitions. If
$t_1, \ldots, t_r$ are terms, f is an operation of degree r, and p is a relation of
degree r, then $f t_1 \cdots t_r$ is a term, and $p t_1 \cdots t_r$ is an atomic formula. If P
and Q are formulas, then $\downarrow PQ$ and $\forall x P$ are formulas. The content of $\downarrow PQ$
is “not P and not Q,” so that we have the following expressions in this
dialect:

$\neg(P)$: $\downarrow PP$
$(P) \wedge (Q)$: $\downarrow\downarrow PP \downarrow QQ$
$(P) \vee (Q)$: $\downarrow\downarrow PQ \downarrow PQ$


Clearly, economizing on parentheses and connectives leads to much repeti- 
tion of the same formula. Nevertheless, it may become simpler to prove 
theorems about such a language because of the shorter list of syntactic 
norms. 
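
The three abbreviations can be verified by evaluating the parentheses-free strings directly. This is a minimal sketch, assuming that formulas are written as Python strings in which `↓` is the single connective and each capital letter stands for an already evaluated formula; `evaluate` and `same_truth_table` are hypothetical names introduced for the illustration.

```python
from itertools import product


def evaluate(expr, values):
    """Evaluate a parentheses-free formula built from letters and '↓'.

    '↓' is the binary connective "not P and not Q"; `values` maps each
    letter to True or False.
    """
    def parse(pos):
        symbol = expr[pos]
        if symbol == "↓":
            left, pos = parse(pos + 1)   # first argument of the connective
            right, pos = parse(pos)      # second argument starts where the first ended
            return (not left) and (not right), pos
        return values[symbol], pos + 1

    value, end = parse(0)
    if end != len(expr):
        raise ValueError("trailing symbols after a complete formula")
    return value


def same_truth_table(expr, reference):
    """Compare `expr` with a Python function of (P, Q) on all four rows."""
    return all(evaluate(expr, {"P": p, "Q": q}) == reference(p, q)
               for p, q in product([False, True], repeat=2))
```

For instance, `same_truth_table("↓↓PP↓QQ", lambda p, q: p and q)` compares the second abbreviation with ordinary conjunction. The parser never needs parentheses because each `↓` is followed by exactly two complete formulas, which is the unique-reading property the dialect relies on.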

(b) Bourbaki's language of set theory has an alphabet consisting of the
signs $\square$, $\tau$, $\vee$, $\neg$, $=$, $\in$ and the letters. Expressions in this language are
not simply sequences of signs in the alphabet, but sequences in which
certain elements are paired together by superlinear connectives. For example:


$\tau \vee \neg \in \square A' \in \square A''$

[the superlinear links joining $\tau$ to the signs $\square$ are not reproduced here].


The main difference between Bourbaki's language and $L_1\mathrm{Set}$ is the use of
the “Hilbert choice symbol.” If, for example, $\in xy$ is the formula “x is an
element of y,” then


$\tau \in \square y$ (with $\tau$ linked to $\square$)


is a term meaning “some element of the set y.”

Bourbaki’s language is not very convenient and is not widely used. It 
became known in the popular literature thanks to an example of a very 
long abbreviated notation for the term “one,” which the authors impru- 
dently introduced: 


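$\tau_Z((\exists u)(\exists U)(u = (U, \{\emptyset\}, Z) \wedge U \subset \{\emptyset\} \times Z \wedge (\forall x)((x \in \{\emptyset\})$
$\Rightarrow (\exists y)((x, y) \in U)) \wedge (\forall x)(\forall y)(\forall y')(((x, y) \in U \wedge (x, y') \in U)$
$\Rightarrow (y = y')) \wedge (\forall y)((y \in Z) \Rightarrow (\exists x)((x, y) \in U))))$.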


It would take several tens of thousands of symbols to write out this term 
completely; this seems a little too much for “one.” 

(c) A way to greatly extend the expressive possibilities of almost any
language in $\mathcal{L}_1$ is to allow “class terms” of the type $\{x \mid P(x)\}$, meaning
“the class of all objects x having the property P.” This idea was used by
Morse in his language of set theory and by Smullyan in his language of
arithmetic; see §10 of Chapter II.

3. General remarks. Most natural and artificial languages are characteristically
discrete and linear (one-dimensional). On the one hand, our perception
of the external world is not felt by us to be either discrete or linear,
although these characteristics are observed on the level of physiological
mechanisms (coding by impulses in the nervous system). On the other
hand, the languages in which we communicate tend to transmit information
in a sequence of distinguishable elementary signs. The main reason for
this is probably the much greater (theoretically unlimited) uniqueness and
reproducibility of information than is possible with other methods of
conveyance. Compare with the well-known advantages of digital over
analog computers.

The human brain clearly uses both principles. The perception of images 
as a whole, along with emotions, are more closely connected with nonlin- 
ear and nondiscrete processes—perhaps of a wave nature. It is interesting 
to examine from this point of view the nonlinear fragments in various 
languages. 

In mathematics this includes, first of all, the use of drawings. But this 
use does not lend itself to formal description, with the exception of the 
separate and formalized theory of graphs. Graphs are especially popular 
objects, because they are as close as possible both to their visual image as a 
whole and to their description using all the rules of set theory. Every time 
we are able to connect a problem with a graph, it becomes much simpler to 
discuss it, and large sections of verbal description are replaced by manipu- 
lation with pictures. 

A less well-known class of examples is the commutative diagrams and 
spectral sequences of homological algebra. A typical example is the “snake 
lemma.” Here is its precise formulation. 

Suppose we are given a commutative diagram of abelian groups and
homomorphisms between them (in the box below), in which the rows are
exact sequences:


[Diagram not reproduced: the boxed commutative diagram with vertical homomorphisms f, g, h, surrounded by the sequence
$0 \to \operatorname{Ker} f \to \operatorname{Ker} g \to \operatorname{Ker} h \dashrightarrow \operatorname{Coker} f \to \operatorname{Coker} g \to \operatorname{Coker} h \to 0$.]


Then the kernels and cokernels of the “vertical” homomorphisms f, g, h
form a six-term exact sequence, as shown in the drawing, and the entire
diagram of solid arrows is commutative. The “snake” morphism
$\operatorname{Ker} h \to \operatorname{Coker} f$, which is denoted by the dotted arrow, is the basic object
constructed in the lemma.

Of course, it is easy to describe the snake diagram sequentially in a 
suitable, more or less formal, linear language. However, such a procedure 
requires an artificial and not uniquely determined breaking up of a clearly 
two-dimensional picture (as in scanning a television image). Moreover, 
without having the overall image in mind, it becomes harder to recognize 
the analogous situation in other contexts and to bring the information 
together into a single block. 

The beginnings of homological algebra saw the enthusiastic recognition 
of useful classes of diagrams. At first this interest was even exaggerated; 
see the editor’s appendix to the Russian translation of Homological Algebra 
by Cartan and Eilenberg. 

There is one striking example of an entire book with an intentional
two-dimensional (block) structure: C. H. Lindsey and S. G. van der
Meulen, Informal Introduction to Algol 68 (North-Holland, Amsterdam,
1971). It consists of eight chapters, each of which is divided into seven
sections (eight of the 56 sections are empty, to make the system work!). Let
(i, j) be the name of the jth section of the ith chapter; then the book can
be studied either “row by row” or “column by column” in the (i, j)-matrix,
depending on the reader's intentions.

As with all great undertakings, this is the fruit of an attempt to solve 
what is in all likelihood an insoluble problem, since, as the authors remark, 
Algol 68 “is quite impossible to describe ... until it has been described.” 

CHAPTER II


Truth and deducibility 


1 Unique reading lemma 


The basic content of this section is Lemma 1.4 and Definitions 1.5 and 1.6.
The lemma guarantees that the terms and formulas of any language in $\mathcal{L}_1$
can be deciphered in a unique way, and it serves as a basis for most
inductive arguments. (The reader may take the lemma on faith for the time
being, provided that he was able independently to verify the last formula
in 3.7 of Chapter I. However, the proof of the lemma will be needed in §4
of Chapter VII.) It is important to remember that the theory of any formal
language begins by checking that the syntactic rules are free of ambiguity.

We begin with the standard combinatoric definitions, in order to fix the 
terminology. 


1.1. Let A be a set. By a sequence of length n of elements of A we mean a
mapping from the set $\{1, \ldots, n\}$ to A. The image of i is called the ith term
of the sequence. Corresponding to n = 0 we have the empty sequence.
Sequences of length 1 will sometimes be identified with elements of A.

A sequence of length n can also be written in the form $a_1, \ldots, a_i, \ldots, a_n$,
where $a_i$ is its ith term. The number i is called the index of the
term $a_i$. If $P = (a_1, \ldots, a_n)$ and $Q = (b_1, \ldots, b_m)$ are two sequences, their
concatenation PQ is the sequence $(a_1, \ldots, a_n, b_1, \ldots, b_m)$ of length $m + n$
whose ith term is $a_i$ for $i \leq n$ and $b_{i-n}$ for $n + 1 \leq i \leq n + m$. We similarly
define the concatenation of a finite sequence of sequences.

An occurrence of the sequence Q in P is any representation of P as a
concatenation $P_1 Q P_2$. Substituting a sequence R in place of a given
occurrence of Q in P amounts to constructing the sequence $P_1 R P_2$.
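
These combinatorial definitions translate almost word for word into operations on tuples. In the sketch below (hypothetical helper names; Python indices start at 0, whereas the text counts from 1), an occurrence of Q in P is represented by the pair $(P_1, P_2)$ of surrounding sequences.

```python
def occurrences(q, p):
    """All pairs (p1, p2) with p == p1 + q + p2, i.e. the occurrences of q in p."""
    q, p = tuple(q), tuple(p)
    return [(p[:i], p[i + len(q):])
            for i in range(len(p) - len(q) + 1)
            if p[i:i + len(q)] == q]


def substitute(r, occurrence):
    """Replace the chosen occurrence of Q by R: the result is P1 R P2."""
    p1, p2 = occurrence
    return p1 + tuple(r) + p2
```

Note that the same Q may occur several times in P, and that the empty sequence occurs at every position; this is why substitution is defined for a given occurrence and not for Q alone.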



Let $\Pi^+$, $\Pi^-$ be two disjoint subsets of $\{1, \ldots, n\}$. A map $c : \Pi^+ \to \Pi^-$
is called a parentheses bijection if it is bijective and satisfies the conditions:


(a) $c(i) > i$ for all $i \in \Pi^+$;
(b) for every i and j, $j \in [i, c(i)]$ if and only if $c(j) \in [i, c(i)]$.


1.2. Lemma. Given $\Pi^+$ and $\Pi^-$, if a parentheses bijection exists, then it is
unique.


This lemma will be applied to expressions in languages in $\mathcal{L}_1$: $\Pi^+$ will
consist of the indices of the places in the expression at which “(” occurs,
$\Pi^-$ will consist of the indices of the places at which “)” occurs, and the
map c correlates to each left parenthesis the corresponding right parenthesis.


PROOF OF THE LEMMA. Let the function $\varepsilon : \{1, \ldots, n\} \to \{0, \pm 1\}$ take the
value 1 on $\Pi^+$, $-1$ on $\Pi^-$, and 0 everywhere else. We claim that for every
$i \in \Pi^+$, for any parentheses bijection $c : \Pi^+ \to \Pi^-$, and for any k, $1 \leq k
\leq c(i) - i$, we have the relations:


$\sum_{j=i}^{c(i)} \varepsilon(j) = 0, \qquad \sum_{j=i}^{c(i)-k} \varepsilon(j) > 0.$


The lemma follows immediately from these relations, since we obtain
the following recipe for determining c from $\Pi^+$ and $\Pi^-$: c(i) is the least
$l > i$ for which $\sum_{j=i}^{l} \varepsilon(j) = 0$.

The first relation holds because the elements of $\Pi^+$ and $\Pi^-$ which
appear in the interval $[i, c(i)]$ do so in pairs $(j, c(j))$, and $\varepsilon(j) + \varepsilon(c(j))
= 0$.

To prove the second relation, suppose that for some i and k we have
$\sum_{j=i}^{c(i)-k} \varepsilon(j) \leq 0$. Since $\varepsilon(i) = 1$, it follows that $\sum_{j=i+1}^{c(i)-k} \varepsilon(j) < 0$. Hence, the
number of elements of $\Pi^-$ in the interval $[i + 1, c(i) - k]$ is strictly
greater than the number from $\Pi^+$. Let $c(j_0) \in \Pi^-$ be an element in the
interval such that $j_0 \notin [i + 1, c(i) - k]$. Then $j_0 \leq i$, and in fact, $j_0 < i$,
since c(i) is outside the interval. But then only one element of the pair $j_0$,
$c(j_0)$ lies in $[i, c(i)]$, which contradicts the definition of c. □
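
The recipe in the proof is effectively an algorithm: starting at a left parenthesis, add up the values of the sign function until the running sum first returns to zero. A minimal sketch follows, with the hypothetical name `matching_parentheses` and with indices counted from 1 as in the text; it returns the map given by the recipe, or `None` when the recipe fails to produce a bijection onto the right parentheses.

```python
def matching_parentheses(expression):
    """Return the map c (left index -> right index), or None if none exists.

    Indices are 1-based, as in the text.
    """
    eps = {"(": 1, ")": -1}          # every other symbol counts as 0
    n = len(expression)
    c = {}
    for i in range(1, n + 1):
        if expression[i - 1] != "(":
            continue
        total = 0
        for end in range(i, n + 1):
            total += eps.get(expression[end - 1], 0)
            if total == 0:           # least index at which the sum vanishes
                c[i] = end
                break
        else:
            return None              # this "(" is never closed
    rights = [i for i in range(1, n + 1) if expression[i - 1] == ")"]
    # c must be a bijection onto the set of right parentheses.
    if sorted(c.values()) != rights:
        return None
    return c
```

For the expression `f(g(x),y)` the result pairs position 2 with position 9 and position 4 with position 6. The double loop is quadratic; a single pass with a stack would do the same job, but the version above stays closer to the statement in the proof.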


1.3. Now let A be the alphabet of a language L in $\mathcal{L}_1$ (see §2 of Chapter I).
Finite sequences of elements of A are the expressions in this language.
Certain expressions have been distinguished as formulas or terms. We
recall that the definitions in §2 of Chapter I imply that:

(a) Any term in L either is a constant, is a variable, or is represented in
the form $f(t_1, \ldots, t_r)$, where f is an operation of degree r, and $t_1, \ldots, t_r$
are terms shorter in length.

(b) Any formula in L is represented either in the form $p(t_1, \ldots, t_r)$,
where p is a relation of degree r and $t_1, \ldots, t_r$ are terms shorter in length,
or in one of the seven forms


$(P) \Leftrightarrow (Q)$, $(P) \Rightarrow (Q)$, $(P) \vee (Q)$, $(P) \wedge (Q)$,
$\neg(P)$, $\forall x(P)$, $\exists x(P)$,


where P and Q are formulas shorter in length, and x is a variable.

The following result is then obtained by induction on the length of the
expression: if E is a term or a formula, then there exists a parentheses
bijection between the set $\Pi^+$ of indices of left parentheses in E and the set
$\Pi^-$ of indices of right parentheses. In fact, the new parentheses in 1.3(a) and
(b) have a natural bijection, while the old ones (which might be contained
in the terms $t_1, \ldots, t_r$ or the formulas P, Q) have such a bijection by the
induction assumption. In addition, the new parentheses never come between
two paired old parentheses.

We can now state the basic result of this section: 


1.4. Unique Reading Lemma. Every expression in L is either a term, or a 
formula, or neither. These alternatives, as well as all of the alternatives 
listed in 1.3(a) and (b), are mutually exclusive. Every term (resp. formula) 
can be represented in exactly one of the forms in 1.3(a) (resp. 1.3(b)), and 
in a unique way. 

In addition, in the course of the proof we show that, if an expression 
is the concatenation of a finite sequence of terms, then it is uniquely 
representable as such a concatenation. 


PROOF. Using induction on the length of the expression E, we describe an
informal algorithm for syntactic analysis, which uniquely determines which
alternative holds.

(a) If there are no parentheses in E, then E is either a constant term, a
variable term, or neither a term nor a formula.

(b) If E contains parentheses, but there is no parentheses bijection
between the left and right parentheses, then E is neither a term nor a
formula.

(c) Suppose E contains parentheses with a parentheses bijection. Then
either E is uniquely represented in one of the nine forms


$f(E_0)$ (where f is an operation),
$p(E_0)$ (where p is a relation),
$(E_1) \Leftrightarrow (E_2)$, $(E_1) \Rightarrow (E_2)$, $(E_1) \vee (E_2)$, $(E_1) \wedge (E_2)$,
$\neg(E_3)$, $\forall x(E_3)$, $\exists x(E_3)$,

or else E is neither a term nor a formula. Here the pairs of parentheses we
have written out are connected by the unique parentheses bijection which
is assumed to exist in E; this is what ensures uniqueness. In fact, we obtain
the form $f(E_0)$ if and only if the first element of the expression is a
function, the second element is “(,” and the last element is the “)” which
corresponds under the bijection; and similarly for the other forms.

We have thereby reduced the problem to the syntactic analysis of the
expressions $E_0$, $E_1$, $E_2$, $E_3$, which are shorter in length. This almost
completes our description of the algorithm, since what remains to be
determined about $E_1$, $E_2$, $E_3$ is whether or not they are formulas. However,
for $E_0$ we must determine whether this expression is a concatenation of the
right number of terms, and we must ask whether such a representation
must be unique.

The answer to the latter question is positive. We have the following
recipe for breaking off terms from left to right in a concatenation of terms.

(d) Let $E_0$ be an expression having a parentheses bijection between its
left and right parentheses. If $E_0$ can be represented in the form $tE_0'$, where t
is a term, then this representation is unique. In fact, either $E_0$ can be
uniquely represented in one of the forms


$xE_0'$, $cE_0'$, $f(E_0'')E_0'$


(where x is a variable, c is a constant, and f is an operation whose
parentheses correspond under the unique parentheses bijection in $E_0$), or
else $E_0$ cannot be represented in the form $tE_0'$, where t is a term. In the
cases $E_0 = xE_0'$ or $E_0 = cE_0'$, this is obviously the only way to break off a
term from the left. In the case $E_0 = f(E_0'')E_0'$, the question reduces to
whether or not $E_0''$ is a concatenation of degree(f) terms. By induction on
the length of $E_0$, we may assume that either $E_0''$ is not such a concatenation,
or else it is uniquely representable as a concatenation of terms. The
lemma is proved. □


EXERCISE: State and prove a unique reading lemma for the “parentheses-less”
dialect of $\mathcal{L}_1$ described in 2(a) of “Digression: Syntax” in Chapter I.


Here is the first inductive description of the difference between free and 
bound occurrences of a variable in terms and formulas. The correctness of 
the following definitions is ensured by Lemma 1.4. 


1.5. Definition.

(a) Every occurrence of a variable in an atomic formula or term is
free.

(b) Every occurrence of a variable in $\neg(P)$ or in $(P_1) * (P_2)$ (where
$*$ is any of the connectives “$\vee$,” “$\wedge$,” “$\Rightarrow$,” or “$\Leftrightarrow$”) is free
(respectively bound) if and only if the corresponding occurrence in P, $P_1$, or $P_2$
is free (respectively bound).

(c) Every occurrence of the variable x in $\forall x(P)$ and $\exists x(P)$ is bound.
The occurrences of other variables in $\forall x(P)$ and $\exists x(P)$ are the same as
the corresponding occurrences in P.


Suppose the quantifier $\forall$ (or $\exists$) occurs in the formula P. It follows from
the definitions that it must be followed in P by a variable and a left
parenthesis. The expression which begins with this variable and ends with
the corresponding right parenthesis is called the scope of the given
(occurrence of the) quantifier.


1.6. Definition. Suppose we are given a formula P, a free occurrence of the
variable x in P, and a term t. We say that t is free for the given
occurrence of x in P if the occurrence does not lie in the scope of any
quantifier of the form $\exists y$ or $\forall y$, where y is a variable occurring in t.


In other words, if t is substituted in place of the given occurrence of x,
all free occurrences of variables in t remain free in P.

We usually have to substitute a term for each free occurrence of a given
variable. It is important to note that this operation takes terms into terms
and formulas into formulas (induction on the length). If t is free for each
free occurrence of x in P we simply say that t is free for x in P.
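
Definitions 1.5 and 1.6 are inductive, so they can be sketched as recursive functions on parsed formulas. The representation below is an assumption made for the illustration and is not the book's notation: terms are tuples `("var", x)`, `("const", c)`, or `("app", f, [terms])`, and formulas are `("atom", p, [terms])`, `("not", P)`, `("bin", op, P, Q)`, `("forall", x, P)`, or `("exists", x, P)`.

```python
def term_vars(t):
    """Variables occurring in a term."""
    kind = t[0]
    if kind == "var":
        return {t[1]}
    if kind == "const":
        return set()
    return set().union(*(term_vars(s) for s in t[2]))   # ("app", f, args)


def free_vars(p):
    """Variables having at least one free occurrence in the formula p."""
    kind = p[0]
    if kind == "atom":                       # 1.5(a)
        return set().union(*(term_vars(s) for s in p[2]))
    if kind == "not":                        # 1.5(b)
        return free_vars(p[1])
    if kind == "bin":                        # 1.5(b)
        return free_vars(p[2]) | free_vars(p[3])
    return free_vars(p[2]) - {p[1]}          # 1.5(c): quantifier binds p[1]


def is_free_for(t, x, p):
    """Definition 1.6: is the term t free for x in the formula p?"""
    kind = p[0]
    if kind == "atom":
        return True
    if kind == "not":
        return is_free_for(t, x, p[1])
    if kind == "bin":
        return is_free_for(t, x, p[2]) and is_free_for(t, x, p[3])
    bound, body = p[1], p[2]
    if bound == x or x not in free_vars(body):
        return True    # no free occurrence of x lies inside this quantifier
    return bound not in term_vars(t) and is_free_for(t, x, body)
```

With P taken to be $\exists x(x \in y)$, the only free variable is y; the term x is not free for y in P, because substituting it would place x inside the scope of $\exists x$, while the term z is free for y.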


1.7. We shall start working with definitions 1.5 and 1.6 in the next section. 
Here we shall only give some intuitive explanations. 

Definition 1.5 allows us to introduce the important class of closed
formulas. By definition, this consists of formulas without free variables.
(They are also called sentences.) The intuitive meaning of the concept of a
closed formula is as follows. A closed formula corresponds to an assertion
which is completely determined (in particular, regarding truth or falsity);
indeterminate objects of the theory are only mentioned in the context “all
objects x satisfy the condition...” or “there exists an object y with the
property... .” Conversely, a formula which is not closed, such as $x \in y$
or $\exists x(x \in y)$, may be true or false depending on what sets are being
designated by the names x and y (for the first) or by the name y (for the
second). Here truth or falsity is understood to mean for a fixed interpretation
of the language, as will be explained in §2.

In particular, Definition 1.6 gives the rules of hygiene for changing
notation. If we want to call an indeterminate object x by another name y
in a given formula, we must be sure that x does not appear in the parts of
the formula where this name y is already being used to denote an arbitrary
indeterminate object (after a quantifier). In other words, y must be free for
x. Moreover, if we want to say that x is obtained from certain operations
on other indeterminate objects (x = a term containing $y_1, \ldots, y_n$), then the
variables $y_1, \ldots, y_n$ must not be bound.

There is a close parallel to these rules in the language of analysis:
instead of $\int_1^x f(y)\,dy$ we may confidently write $\int_1^x f(z)\,dz$, but we must not
write $\int_1^x f(x)\,dx$; the variable y is bound in the scope of $\int f(y)\,dy$.


2 Interpretation: truth, definability 


2.1. Suppose we are given a language L in $\mathcal{L}_1$ and a set (or class) M. To
give an interpretation of L in M means to tell how a formula in L can be
given a meaning as a statement about the elements of M.

More precisely, an interpretation $\varphi$ of the language L in M consists of a
collection of mappings which correlate terms and formulas of the language
to elements of M and structures over M (in the sense of Bourbaki). These
mappings are divided into primary mappings, which actually determine the
interpretation, and secondary mappings, which are constructed in a natural
and unique way from the primary mappings. We shall use the term
interpretation to refer to the mappings themselves, and sometimes also to
the values they take.

Let us proceed to the systematic definitions. We shall sometimes call the
elements of the alphabet of L symbols. The notation $\varphi$ for the interpretation
will either be included when writing the mappings or omitted, depending
on the context.


2.2. Primary mappings

(a) An interpretation of the constants is a map from the set of symbols
for constants (in the alphabet of L) to M, which takes a symbol c to
$\varphi(c) \in M$.

(b) An interpretation of the operations is a map from the set of symbols
for operations (in the alphabet of L) which takes a symbol f of degree r to
a function $\varphi(f)$ on $M \times \cdots \times M = M^r$ with values in M.

(c) An interpretation of the relations is a map from the set of symbols
for relations (in the alphabet of L) which takes a symbol p of degree r to a
subset $\varphi(p) \subset M^r$.


Secondary mappings

Intuitively, we would like to interpret variables as
names for the “generic element” of the set M, which can be given specific
values in M. We would like to interpret the term $f(x_1, \ldots, x_r)$ as a
function $\varphi(f)$ of r arguments which run through values in M, and so on.
In order to give a precise definition, we introduce the interpretation class
$\bar{M}$:

$\bar{M}$ = the set of all maps to M from the set of symbols for variables
in the alphabet of L.

Thus, every point $\xi \in \bar{M}$ correlates to any variable x a value $\varphi(x)(\xi) \in M$,
which we shall usually denote simply $x^\xi$. This allows us to consider
variables as functions on $\bar{M}$ with values in M. More generally:


2.3. The interpretation of terms correlates to each term t a function $\varphi(t)$ on
$\bar{M}$ with values in M. This correspondence is defined inductively by the
following compatibilities:


(a) If c is a constant, then $\varphi(c)$ is the constant function whose value is
defined by the primary mapping.
(b) If x is a variable, then $\varphi(x)$ is $\varphi(x)(\xi)$ as a function of $\xi$.
(c) If $t = f(t_1, \ldots, t_r)$, then for all $\xi \in \bar{M}$

$\varphi(t)(\xi) = \varphi(f)(\varphi(t_1)(\xi), \ldots, \varphi(t_r)(\xi)),$

where the $\varphi(t_i)(\xi)$ are defined by the induction assumption, and
$\varphi(f) : M^r \to M$ is given by the primary mapping. Instead of $\varphi(t)(\xi)$ we
shall sometimes write simply $t^\xi$.

2.4. Interpretation of atomic formulas. An interpretation $\varphi$ assigns to every
formula P in L a truth function $|P|_\varphi$. This is a function on the interpretation
class $\bar{M}$ which only takes the values 0 (“false”) and 1 (“true”). It is
defined for atomic formulas as follows:


$|p(t_1, \ldots, t_r)|_\varphi(\xi) = \begin{cases} 1, & \text{if } \langle t_1^\xi, \ldots, t_r^\xi \rangle \in \varphi(p), \\ 0, & \text{otherwise.} \end{cases}$

Intuitively, a statement p about the names $t_1, \ldots, t_r$ for objects in M
becomes true if the objects named by $t_1, \ldots, t_r$ satisfy the relation named
by p.

2.5. Interpretation of formulas. The truth function for nonatomic formulas
is defined inductively by means of the following relations (for brevity, we
have omitted parentheses and explicit mention of $\varphi$ and $\xi$):


$|P \Leftrightarrow Q| = |P|\,|Q| + (1 - |P|)(1 - |Q|)$:
$P \Leftrightarrow Q$ is true when either P and Q are both true or P and Q are both false.
$|P \Rightarrow Q| = 1 - |P| + |P|\,|Q|$:
$P \Rightarrow Q$ is only false when P is true and Q is false.
$|P \vee Q| = \max(|P|, |Q|)$:
$P \vee Q$ is only false when P and Q are both false.
$|P \wedge Q| = \min(|P|, |Q|)$:
$P \wedge Q$ is only true when P and Q are both true.
$|\neg P| = 1 - |P|$:
$\neg P$ is only false when P is true.

Finally, we must describe what happens when quantifiers are introduced.
Suppose that $\xi \in \bar{M}$ and x is a variable. By a variation of $\xi$ along
x we mean any point $\xi' \in \bar{M}$ for which $y^\xi = y^{\xi'}$ whenever y is a variable
different from x. Then


$|\forall x P|(\xi) = \min |P|(\xi'),$
$|\exists x P|(\xi) = \max |P|(\xi'),$


where $\xi'$ runs through all variations of $\xi$ along x.

A formula P is called $\varphi$-true if $|P|_\varphi(\xi) = 1$ for all $\xi \in \bar{M}$. The interpretation
$\varphi$ (or M) is called a model for a set of formulas $\mathcal{E}$ if all the elements of
$\mathcal{E}$ are $\varphi$-true.
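
When M is finite, the inductive clauses of 2.3 to 2.5 can be executed as they stand. The sketch below assumes a hypothetical representation that is not the book's: terms are `("var", x)`, `("const", c)`, or `("app", f, [terms])`; formulas are `("atom", p, [terms])`, `("not", P)`, `("bin", op, P, Q)` with `op` one of `"iff"`, `"implies"`, `"or"`, `"and"`, and `("forall", x, P)` or `("exists", x, P)`. The interpretation `phi` is a dictionary holding a nonempty list `"M"` and the primary mappings `"const"`, `"op"`, and `"rel"`; a point $\xi$ is a dictionary from variable names to elements of M.

```python
def term_value(t, phi, xi):
    """phi(t)(xi), following 2.3(a)-(c)."""
    kind = t[0]
    if kind == "const":
        return phi["const"][t[1]]
    if kind == "var":
        return xi[t[1]]
    return phi["op"][t[1]](*(term_value(s, phi, xi) for s in t[2]))


def truth(p, phi, xi):
    """|P|_phi(xi) in {0, 1}, following 2.4 and 2.5."""
    kind = p[0]
    if kind == "atom":
        args = tuple(term_value(s, phi, xi) for s in p[2])
        return 1 if args in phi["rel"][p[1]] else 0
    if kind == "not":
        return 1 - truth(p[1], phi, xi)
    if kind == "bin":
        a, b = truth(p[2], phi, xi), truth(p[3], phi, xi)
        return {"iff": a * b + (1 - a) * (1 - b),
                "implies": 1 - a + a * b,
                "or": max(a, b),
                "and": min(a, b)}[p[1]]
    # Quantifiers: xi' runs through all variations of xi along x = p[1].
    values = [truth(p[2], phi, {**xi, p[1]: m}) for m in phi["M"]]
    return min(values) if kind == "forall" else max(values)
```

For the standard interpretations in N or in the von Neumann universe the minimum and maximum range over infinitely many variations, so this direct evaluation is available only for finite M; that restriction is a limitation of the sketch, not of the definition.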


2.6. EXAMPLE: STANDARD INTERPRETATION OF $L_1\mathrm{Ar}$. This is the interpretation
in the set N of nonnegative integers, in which 0, 1 are interpreted as
0, 1, respectively, and $+$, $\cdot$, $=$ are interpreted as addition, multiplication,
and equality, respectively.


2.7. EXAMPLE: STANDARD INTERPRETATION OF $L_1\mathrm{Set}$. This is the interpretation
in the von Neumann universe V, in which $\emptyset$ is interpreted as the
empty set, $\in$ is interpreted as the relation “is an element in,” and $=$ is
interpreted as equality.


All of the examples of translations in Chapter I were based on these
standard interpretations. The relationship between those examples and the
above definitions is as follows. Let $\Pi(x, y, z)$ be a statement in argot about
the indeterminate sets x, y, z in V; and let $P(x, y, z)$ be a translation of $\Pi$
into the language $L_1\mathrm{Set}$. Then for any point $\xi$ interpreting x, y, z as the
names of sets $x^\xi$, $y^\xi$, $z^\xi$ in the von Neumann universe, we have:


$\Pi(x^\xi, y^\xi, z^\xi)$ is true $\Leftrightarrow |P(x, y, z)|(\xi) = 1$.

Thus, every formula expresses, or defines, a property of objects in the
interpretation set:


2.8. Definition. A set S ⊂ M^r, r ≥ 1, is called φ-definable (by the formula
P in L with the interpretation φ) if there exist variables x₁, ..., x_r such
that

    |P|(ξ) = 1 ⇔ ⟨x₁^ξ, ..., x_r^ξ⟩ ∈ S

for all ξ in M.
One of the most important problems concerning formal languages is to
understand the structure of the sets of

    φ-true formulas in L;

    φ-definable sets in ⋃_{r≥1} M^r.


2.9. EXAMPLE. The sets definable by means of L₁Ar with the standard
interpretation constitute the smallest class of sets in ⋃_{r≥1} N^r which

(a) contains all sets of the form

    {⟨k₁, ..., k_r⟩ | F(k₁, ..., k_r) = 0} ⊂ N^r,

where F runs through all polynomials with integral coefficients.

(b) is closed relative to finite intersections, unions, and complements (in
the appropriate N^r)

(c) is closed relative to the projections pr_i : N^r → N^(r−1):

    pr_i⟨k₁, ..., k_r⟩ = ⟨k₁, ..., k_(i−1), k_(i+1), ..., k_r⟩.

In fact, sets of type (a) are defined by atomic formulas of the form
t₁ = t₂, where t₁ is a term corresponding to the sum of the monomials in
F with positive coefficients, and t₂ corresponds to the sum of the
monomials with negative coefficients (taken with the opposite sign).
Further, if S₁, S₂ ⊂ N^r are definable by formulas P₁, P₂ (with the same
variables), then S₁ ∩ S₂ is definable by P₁ ∧ P₂, S₁ ∪ S₂ is definable by
P₁ ∨ P₂, and N^r \ S₁ is definable by ¬P₁. Finally, the set pr_i(S₁) is
definable by the formula ∃x_i(P₁). The connectives ⇒ and ⇔ and the
quantifier ∀ give nothing new, since, without changing the set being
defined, we may replace them by combinations of the logical operations
already discussed: ∀x may be replaced by ¬∃x¬, and so on.
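
The three closure operations can be imitated on a finite window of N. The
following Python sketch is only an illustration: the bound BOUND and the
names zero_set, complement, and project are assumptions of the sketch, and a
projection computed inside the window agrees with the true projection only
when the witnesses it needs also lie inside the window (as they do in the
example of even numbers below).

```python
from itertools import product

BOUND = 20  # assumption: we only inspect the window {0, ..., BOUND - 1}

def zero_set(F, r):
    """Type (a): the r-tuples in the window on which the polynomial F vanishes."""
    return {k for k in product(range(BOUND), repeat=r) if F(*k) == 0}

def complement(S, r):
    """Complement of S inside the window of r-tuples."""
    return set(product(range(BOUND), repeat=r)) - S

def project(S, i):
    """Type (c): drop the i-th coordinate (1-based), as the projection pr_i does."""
    return {k[:i - 1] + k[i:] for k in S}

# Even numbers: the projection along k2 of the zero set of F(k1, k2) = k1 - 2*k2,
# i.e. the set defined by "there exists x2 such that x1 = x2 + x2".
evens = project(zero_set(lambda k1, k2: k1 - 2 * k2, 2), 2)
odds = complement(evens, 1)
```

Here evens and odds are sets of 1-tuples, so that the same functions apply
uniformly to subsets of N^r for every r.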


This first description of arithmetical sets, i.e., L₁Ar-definable sets, will be
greatly amplified in the second and third parts of the book. At this point it 
is not immediately clear how to develop the subtler properties of definabil- 
ity, such as the definability of the set of prime numbers in N (see example 
3.14 in Chapter I), the definability of the set of partial fractions in the 


continued fraction expansion of ∛2, or the definability of the set of pairs
{⟨i, ith digit in the decimal expansion of π⟩} ⊂ N².


However, as we shall see in §11 and in Chapter VII, the “Gödel numbers
of the true formulas of arithmetic” form a still much more complicated set,
and this set is not definable.

We now give several simple technical results. 


2.10. Proposition. Let P be a formula in L, φ an interpretation in M, and
ξ, ξ′ ∈ M. Suppose that x^ξ coincides with x^ξ′ for all variables x occurring
freely in P. Then |P|_φ(ξ) = |P|_φ(ξ′).


2.11. Corollary. In any interpretation the closed formulas P have well-defined
truth values: |P|_φ(ξ) does not depend on ξ.


PROOF.
(a) Let t be a term, and suppose that for any variable x in t we have
x^ξ = x^ξ′. Then Lemma 1.4 and induction on the length of t give t^ξ = t^ξ′.
(b) Assertion 2.10 holds for atomic formulas P of the form p(t₁, ..., t_r).
In fact,

    |P|(ξ) = 1, if ⟨t₁^ξ, ..., t_r^ξ⟩ ∈ φ(p),
             0, otherwise,

and similarly for |P|(ξ′). But if ξ and ξ′ coincide on all the variables in P
(all of which occur freely), then a fortiori they coincide on all the variables
in t_i, and, by part (a), we have t_i^ξ = t_i^ξ′, i = 1, ..., r. Therefore,
|P|(ξ) = |P|(ξ′).
(c) We now use induction on the total number of connectives and
quantifiers in P. If P has the form ¬Q or Q₁ * Q₂, then 2.10 for P follows
trivially from 2.10 for Q, Q₁, Q₂. Now suppose that P has the form ∀x(Q),
and that 2.10 holds for Q. (The case ∃x(Q) can be treated analogously or
can be reduced to the case ∀x by replacing ∃x by ¬∀x¬.) By definition, we
have

    |∀xQ|(ξ) = 1, if |Q|(η) = 1 for all variations η of ξ along x,
               0, otherwise;

    |∀xQ|(ξ′) = 1, if |Q|(η′) = 1 for all variations η′ of ξ′ along x,
                0, otherwise.

On the right we may let η and η′ vary in addition on all variables which do
not occur freely in Q. The assertions after the word “if” remain true or
false in this wider range of values if they were true or false before, by the
induction hypothesis on Q. But then η and η′ run through the same values,
because ξ and ξ′ only differ on variables which do not occur freely in Q,
and on x. The proposition is proved. □


The following almost obvious fact is the basis for many phenomena 
which attest to the inadequacy of formal languages for completely describ- 
ing intuitive concepts (see “Skolem’s paradox” below): 


2.12. Proposition. The cardinality of the class of φ-definable sets does not
exceed

    card(alphabet of L) + ℵ₀.

Here and below, by “card(alphabet of L)” we mean the cardinality of the
alphabet of L without the set of variables.


PROOF. If the language has ≤ ℵ₀ variables, then there are at most
card(alphabet of L) + ℵ₀ formulas.

If, on the other hand, it has an uncountable set of variables, then we note
that every definable set can be defined by a formula whose variables
belong to a fixed countable subset of the variables which is chosen once
and for all. □


2.13. Corollary. If M is infinite and card(alphabet of L) < 2^(card M), then
“almost all” sets are undefinable.


Thus, the only way to define all subsets of M is to include a tremendous
number of names in the language. For languages which are to describe
actual mathematical reasoning this is an unrealistic program. Essentially,
any finitely describable collection of modes of expression only allows us to
define a countable number of sets. However, it is often technically useful
to include in the alphabet, for example, names for all the elements of M.

In the following sections we proceed to study systematically sets of true 
formulas. 


3 Syntactic properties of truth 


Let L be a language in ℒ₁, let φ be an interpretation of L, and let T_φL be
the set of φ-true formulas. In this section we list some properties of T_φL
which reflect the logic inherent in languages of ℒ₁, regardless of the specific
nature of the interpretation φ.


3.1. The set T_φL is complete. By definition, this means that, for any closed
formula P, either P or ¬P lies in T_φL. This property follows from
Corollary 2.11 above.


3.2. The set T_φL does not contain a contradiction, that is, there is no
formula P for which P and ¬P both lie in T_φL. In fact,
T_φL = {P | |P|_φ = 1}, while |¬P|_φ = 1 − |P|_φ.


3.3. The set T_φL is closed under the rules of deduction MP (modus ponens)
and Gen (generalization). By definition, this means that, if P and P ⇒ Q lie
in T_φL, then Q also lies in T_φL; and that, if P lies in T_φL, then ∀xP lies
in T_φL for any variable x. The verification is immediate: if |P|_φ = 1 and
|P ⇒ Q|_φ = 1, then we must have |Q|_φ = 1; if |P|_φ(ξ) = 1 for all ξ, then
also |∀xP|_φ(ξ) = 1. The formula Q is called a direct consequence of the
formulas P and P ⇒ Q using the rule of deduction MP. The formula ∀xP is
called a direct consequence of the formula P using the rule of deduction Gen.

The intuitive meaning of these rules of deduction is as follows. The rule
MP corresponds to the type of argument: “If P is true, and if the truth of
P implies the truth of Q, then Q is true.” Thus, one might say that the
semantics of the expression “if... then” in natural languages is divided
between the semantics of the connective ⇒ and the semantics of the rule of
deduction MP in languages of ℒ₁. Neglecting this point of view often leads
to confusion when one attempts to explain the rules for assigning truth
values to the formula P ⇒ Q.

The rule Gen corresponds to the practice in mathematics of writing
“identities” or universally true assertions. When we write (a + b)² = a² +
2ab + b² or “in a right triangle the square of the hypotenuse is equal to the
sum of the squares of the other two sides,” the quantifiers ∀a ∀b and
∀ triangles are omitted. Putting the quantifiers back in does not change the
truth values, and has the advantage of freeing the notation for later use.


3.4. The set T_φL contains all tautologies. To define what a tautology is, we
first introduce the notion of a logical polynomial over a set of formulas ℰ.
This is an element in the least set of formulas which contains ℰ and is
closed with respect to constructing formulas from shorter formulas using
logical connectives.

A sequence of formulas P₁, ..., P_n and representations of each P_i
either in the form Q, where Q ∈ ℰ, or in the form ¬Q or Q₁ * Q₂, where
Q, Q₁, Q₂ lie in {P₁, ..., P_(i−1)}, is called a representation of P_n as a
logical polynomial over ℰ. The representation of P_n is not necessarily
unique: for example, if ℰ = {P, Q, P ⇒ Q}, then P ⇒ Q has two
representations.

Let | |: ℰ → {0, 1} be any map. If we are given a representation r of
the formula P_n as a logical polynomial over ℰ, then we can use the
formulas in 2.5 to determine |P_n|_r recursively.

A formula P is called a tautology if there exists a set of formulas ℰ and a
representation r of P as a logical polynomial over ℰ such that |P|_r = 1 for
all maps | |: ℰ → {0, 1}. The property of being a tautology is effectively
decidable, since, by syntactically analyzing P we can enumerate all
representations of P as a logical polynomial. All tautologies obviously
belong to T_φL.

Here are our first examples of tautologies:


A0. P ⇒ P

A1. P ⇒ (Q ⇒ P)

A2. (P ⇒ (Q ⇒ R)) ⇒ ((P ⇒ Q) ⇒ (P ⇒ R))

A3. (¬Q ⇒ ¬P) ⇒ ((¬Q ⇒ P) ⇒ Q)

B1. ¬¬P ⇒ P, P ⇒ ¬¬P

B2. ¬P ⇒ (P ⇒ Q).


Here P, Q, and R are arbitrary formulas in L; the form in which these 
tautologies are written makes it clear what representation as a logical 
polynomial over {P, Q, R} is intended. 
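
The decidability claim can be made concrete by running through all maps
| |: {P, Q, R} → {0, 1}. The sketch below is one possible illustration in
Python, not the author's procedure: formulas are assumed to be encoded as
nested tuples with strings standing for the elements of the base set, only
the single representation given by that encoding is examined, and the names
value, base_formulas, and is_tautology are hypothetical.

```python
from itertools import product

# Encoding (an assumption of this sketch): a string is an element of the base
# set; ("not", A), ("imp", A, B), ("and", A, B), ("or", A, B), ("iff", A, B)
# are built from shorter formulas by connectives.

def value(f, v):
    """Compute |f| recursively from the map v on the base set (formulas of 2.5)."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return 1 - value(f[1], v)
    a, b = value(f[1], v), value(f[2], v)
    if f[0] == "imp":
        return 1 - a + a * b
    if f[0] == "and":
        return min(a, b)
    if f[0] == "or":
        return max(a, b)
    if f[0] == "iff":
        return a * b + (1 - a) * (1 - b)
    raise ValueError("unknown connective: %r" % (f[0],))

def base_formulas(f):
    """The elements of the base set that occur in f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(base_formulas(g) for g in f[1:]))

def is_tautology(f):
    """True if |f| = 1 for every map from the base formulas of f to {0, 1}."""
    names = sorted(base_formulas(f))
    return all(value(f, dict(zip(names, bits))) == 1
               for bits in product((0, 1), repeat=len(names)))

def imp(a, b):
    return ("imp", a, b)

def neg(a):
    return ("not", a)

A0 = imp("P", "P")
A1 = imp("P", imp("Q", "P"))
A2 = imp(imp("P", imp("Q", "R")), imp(imp("P", "Q"), imp("P", "R")))
A3 = imp(imp(neg("Q"), neg("P")), imp(imp(neg("Q"), "P"), "Q"))
B1 = [imp(neg(neg("P")), "P"), imp("P", neg(neg("P")))]
B2 = imp(neg("P"), imp("P", "Q"))
```

Calling is_tautology on each of A0, A1, A2, A3, B2 and on both formulas in B1
returns True, while is_tautology(imp("P", "Q")) returns False.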

Thus, tautologies are formulas which are true regardless of the truth or
falsity of the component parts (if the notion of component is suitably
chosen). B1 is the law of the excluded middle: a double negation is
equivalent to the original assertion. B2 is the mechanism by which a
contradiction in a set of formulas ℰ in L leads to the deducibility of any
formula, and thereby destroys the entire system. (See Proposition 4.2
below.)


EXAMPLE OF HOW A TAUTOLOGY IS VERIFIED. We give three versions of how
to verify that the simple formula A1 is a tautology.
Version (a). By the formulas in 2.5, we have

    |P ⇒ (Q ⇒ P)| = 1 − |P| + |P||Q ⇒ P|
                  = 1 − |P| + |P|(1 − |Q| + |P||Q|) = 1,

since |P|² = |P|.
Version (b). We tabulate |P ⇒ (Q ⇒ P)| as a function of |P| and |Q|:

    |P|   |Q|   |Q ⇒ P|   |P ⇒ (Q ⇒ P)|
     0     0       1             1
     0     1       0             1
     1     0       1             1
     1     1       1             1


This is an example of a “truth table.” 

Version (c). The basic property of the connective ⇒ is that P ⇒ Q is
only false if P is true and Q is false. If P ⇒ (Q ⇒ P) were false, then P
would be true and Q ⇒ P would be false; then, in turn, Q would be true
and P would be false, a contradiction.

The reader would do well to verify that the more complicated axioms, 
for example A2, are tautologies, and to decide which of the three versions 
he prefers. 


3.5. The set T_φL contains the “logical quantifier axioms,” that is, the
formulas


(a) ∀x(P ⇒ Q) ⇒ (P ⇒ ∀xQ), if all the occurrences of x in P are bound.

(b) ∀x¬P ⇔ ¬∃xP.

(c) ∀xP(x) ⇒ P(t), if t is free for x in P (axiom of specialization). Here we
use the notation P(t) for the result of substituting t for each free
occurrence of x in P. In all other respects P and Q are arbitrary
formulas.


In 3.7 we verify that the formulas in 3.5 are φ-true. The intuitive
meaning of these formulas is more or less clear. For example, the axiom of
specialization means that, if P(x) is true for all x, then P(t) is also true,
where t is the name of any object. The condition that t must be free for x
is the rule of hygiene for changing notation.
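
The hygiene condition can be checked mechanically. The sketch below assumes
the usual definition, which is not restated in this section: t is free for x
in P if no free occurrence of x in P lies in the scope of a quantifier on a
variable occurring in t. The tuple encoding of formulas and the names
free_vars and is_free_for are hypothetical, and a term is represented only
by the set of variables occurring in it.

```python
# Encoding (an assumption of this sketch):
#   ("atom", name, variables)        atomic formula with the variables in it
#   ("not", A), ("imp", A, B), ...   connectives
#   ("forall", y, A), ("exists", y, A)

def free_vars(P):
    """The set of variables that occur freely in P."""
    kind = P[0]
    if kind == "atom":
        return set(P[2])
    if kind == "not":
        return free_vars(P[1])
    if kind in ("forall", "exists"):
        return free_vars(P[2]) - {P[1]}
    return free_vars(P[1]) | free_vars(P[2])   # binary connective

def is_free_for(t_vars, x, P):
    """True if a term whose variables are t_vars is free for x in P."""
    kind = P[0]
    if kind == "atom":
        return True
    if kind == "not":
        return is_free_for(t_vars, x, P[1])
    if kind in ("forall", "exists"):
        y, body = P[1], P[2]
        if y == x or x not in free_vars(body):
            return True    # no free occurrence of x under this quantifier
        return y not in t_vars and is_free_for(t_vars, x, body)
    return is_free_for(t_vars, x, P[1]) and is_free_for(t_vars, x, P[2])

# P(x) is "there exists y with x < y"; substituting y for x would capture y.
P = ("exists", "y", ("atom", "<", ("x", "y")))
```

Here is_free_for({"z"}, "x", P) is True, while is_free_for({"y"}, "x", P) is
False: replacing x by y would turn P into the different assertion
“there exists y with y < y.”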

The set 


Ax L = {tautologies of L} ∪ {quantifier axioms}


is called the set of logical axioms in the language L. 

A set of formulas ℰ in L will be called Gödelian if it is complete, does not
contain a contradiction, is closed with respect to the rules of deduction MP and
Gen, and contains all the logical axioms of L. The basic conclusion of our
discussion is then:


3.6. Proposition. The set of true formulas of L (in any interpretation) is
Gödelian.


In §6 we prove that, conversely, any Gödelian set is a set of true
formulas in a suitable interpretation. Thus, the concept of a Gödelian set is
the closest approximation to the concept of truth which can be attained
“without regard to meaning.”


3.7. Verification that axioms 3.5 are true.

(a) Let R be the formula 3.5(a). We suppose that |R|(ξ) = 0 for some
ξ ∈ M and show that this leads to a contradiction.

In fact, then |∀x(P ⇒ Q)|(ξ) = 1 and |P ⇒ ∀xQ|(ξ) = 0. The second
equation implies that |P|(ξ) = 1 and |∀xQ|(ξ) = 0. Let ξ′ be a variation of
ξ along x for which |Q|(ξ′) = 0. Then |P|(ξ′) = |P|(ξ) = 1 by Proposition
2.10, since x does not occur freely in P. Hence, |P ⇒ Q|(ξ′) = 0, which
contradicts the relation |∀x(P ⇒ Q)|(ξ) = 1.

(b) For all ξ ∈ M, with ξ′ running through all variations of ξ along x, we
have

    |∀x¬P|(ξ) = min |¬P|(ξ′) = 1 − max |P|(ξ′);
    |¬∃xP|(ξ) = 1 − |∃xP|(ξ) = 1 − max |P|(ξ′).

Hence, the truth values of ∀x¬P and ¬∃xP coincide, so that ∀x¬P ⇔ ¬∃xP
is identically true.

(c) Suppose that |∀xP(x) ⇒ P(t)|(ξ) = 0 for some point ξ ∈ M. We
show that this leads to a contradiction. In fact, then

    |∀xP(x)|(ξ) = 1,   |P(t)|(ξ) = 0.

The first equation implies that |P(x)|(ξ′) = 1 for all variations ξ′ of ξ along
x. For ξ′ we take the variation such that x^ξ′ = t^ξ. If we prove that
|P(t)|(ξ) = |P(x)|(ξ′), then we obtain the desired contradiction.

We prove this by induction on the total number of connectives and
quantifiers in P.

(c₁) Let P be an atomic formula p(t₁, ..., t_r). Letting t̃_i denote the
result of substituting t for each occurrence of x in t_i, we successively
obtain:

    t^ξ = x^ξ′   (by the definition of ξ′),
    t̃_i^ξ = t_i^ξ′   (by induction on the length of t_i),

    |P(x)|(ξ′) = |p(t₁, ..., t_r)|(ξ′) = |p(t̃₁, ..., t̃_r)|(ξ) = |P(t)|(ξ).


(c₂) Let P have the form ¬Q or Q₁ * Q₂, where * is a connective.
Since x does not bind t in P by assumption, the same is true for Q, Q₁, and
Q₂, and the necessary induction step is automatic.

(c₃) Finally, let P have the form ∃yQ or ∀yQ. We shall examine the
first case; the proof for the second case is analogous.

Subcase 1. y = x. Then x is bound in P; therefore, P(x) = P(t), and
|P|(ξ) = |P|(ξ′) by Proposition 2.10.

Subcase 2. y ≠ x. The induction assumption has the form: |Q(t)|(η) =
|Q(x)|(η′), if η is any point in M and η′ is a variation of η along x for
which x^η′ = t^η. We must show that the following two truth values coincide
(where ξ and ξ′ are defined as above):

    |∃yQ(x)|(ξ′) = 1, if |Q(x)|(η′) = 1 for some variation η′ of ξ′ along y,
                   0, otherwise;

    |∃yQ(t)|(ξ) = 1, if |Q(t)|(η) = 1 for some variation η of ξ along y,
                  0, otherwise.

We recall that ξ′ is the variation of ξ along x for which x^ξ′ = t^ξ.

We first suppose that the second truth value is 1. We choose a variation
η of ξ along y so that |Q(t)|(η) = 1, and then construct the variation η′ of
η along x for which x^η′ = t^η. Then, by the induction assumption,
1 = |Q(t)|(η) = |Q(x)|(η′). We show that η′ is a variation of ξ′ along y;
this will imply that the first truth value is also 1. In fact, η′ was obtained
by varying η along x, η was obtained by varying ξ along y, and ξ was
obtained by varying ξ′ along x. Hence, η′ is a variation of ξ′ along x and y;
we must show the variation along x did not actually take place:

    x^η′ = x^ξ′.

But the left-hand side is t^η by the definition of η′; the right-hand side is
t^ξ by the definition of ξ′; and η was obtained by varying ξ along y. Since t
is free for x in P = ∃yQ, it follows that y does not occur in t, and hence
t^η = t^ξ.

It remains to verify that, if the second truth value is 0, then the first is
also 0. The argument is almost the same. If the second truth value is 0,
then |Q(t)|(η) = 0 for all variations η of ξ along y. For each such η we
construct η′ as in the first part of the proof. As before, we verify that η′ is
a variation of ξ′ along y and, moreover, η′ runs through all such variations
when η runs through all variations of ξ along y. Hence, the first truth value
is also 0.

The proposition is proved. □


Digression: natural logic 


1. Logic does not concern itself with the external world, but only with 
systems for trying to understand it. The logic of one such sys- 
tem—mathematics—is normalized to such an extent that it resembles a 
rigid stencil, which we can attempt to impose on any other system. But 
whether or not this stencil fits the system should not be seen as the 
criterion of suitability or the measure of worth of the system. The physi- 
cist’s descriptions do not have to form a consistent or coherent whole; his 
job is to describe nature effectively on certain levels. Natural languages 
and the spontaneous workings of the mind are even less logical. In general, 
adherence to logical principles is only a condition for effectiveness in 
certain narrowly specialized spheres of human endeavor. 

Although comparisons between the logic of predicates and the logic of
natural languages or their subsystems have no normative force, such
comparisons may be interesting and enlightening. Here we give some
selected material from linguistics and psychology.


2. B. Russell, K. Döhmann, H. Reichenbach, U. Weinreich, and many
others have studied the problem of finding parallels in natural languages
for categories which can be formalized in languages of ℒ₁ and of
cataloguing the methods of transmitting these categories. This leads to the
grouping of words into so-called logico-semantic classes, instead of the
traditional division into verbs, nouns, articles, etc. (A. V. Gladkiĭ and
I. A. Mel’čuk, Eléments de linguistique mathématique, Paris, Dunod, 1972, §6).

For example, the words sleeps, smart, cry-baby are parallel to relation 
symbols (predicates) of rank 1; the words loves, friendly, sister correspond 
to relations of rank 2. For each of them we have atomic formulas, such as 
“N sleeps,” “X is friendly to Y,” and so on. 

“All, sometimes, something” are quantifier words; while “and, or, but, 
if... then” are, of course, connectives. “The nose, le cadeau” are con- 
stants. Nouns are made into constants by using the definite article or its 
semantic equivalent. In Russian, which does not have definite articles, one 
must either use the demonstrative pronouns etot (this), tot (that), or make it
clear from the context that the noun is meant as a constant. The words nos 
(nose), podarok (gift) are more like variables which stand for any object 
satisfying the simple predicate “is a nose,” “is a gift.” Incidentally, there 
are other possible interpretations. 

The pronoun “he” is, without doubt, a variable. The pronouns “I” and
“you” have much more complicated semantics, involving a correlation
with who is speaking that does not exist in the speaker-less languages of ℒ₁.
Certain aspects of the first person pronoun are included in the semantics of
algorithmic languages. The right type of “memory key” in a program for
the IBM 360 will allow the program to change what is contained in any
byte in the basic memory region. The memory guard asks “Who is there?”,
and the program answers, “It is I.” Finally, it is even possible in languages
of ℒ₁ to find models for certain types of self-description; see §§9-11 and the
digression on self-reference.

In Russian, “ili” (or) can be used not only to express the logical ∨, but
also to express the exclusive “or” and even to express conjunction ∧, as in
the sentence “x² > 0 for x > 0 or for x < 0” (E. V. Padučeva). In Latin, the
functions of exclusive and inclusive “or” are expressed by two different
words, aut and vel. “And” can sometimes express a time sequence:
compare the sentences “Jane got married and had a baby” with “Jane had
a baby and got married” (S. Kleene). The conjunction ∧ can be expressed
in different languages by:


juxtaposition:             Chinese: ma mo (horse and donkey)
                           Swahili: shika kitabu usome (take a book and read)
a preposition:             Russian: Petja s Mašei (Peter and Marsha)
a conjunction:             and, i, et
a postpositional particle: Latin: senatus populusque (the senate and the
                           people)
two conjunctions:          Russian: kak ... tak.


Döhmann has catalogued the ways of expressing 16 logical polynomials
in two variables in several languages of the world.


3. Curious as all this material may be, it should be regarded critically; in
such comparisons with logic, the subtleties of usage often elude us. As an
example, let us analyze the natural semantics of “if... then.” We have
already mentioned that in languages of ℒ₁ this connective corresponds not
only to “⇒,” but also to the rule of deduction modus ponens. Moreover,
MP more adequately represents the meaning of “if... then.”

Actually, the rule that any conditional is true if its antecedent is known 
to be false has almost no parallel in natural logic. Examples of the type “if 
snow is black, then 2 × 2 = 5,” which keep cropping up in textbooks, are
only capable of confusing the student, since no natural subsystem in our 
language has expressions with this semantics. A possible exception is 
certain poetic and expressive formulas with extremely limited usage (“If 
she be false, O, then heaven mocks itself!”). Formal mathematics, in which 
a single contradiction destroys the entire system, clearly has the features of 
poetic hyperbole. 

Finally, in the logic of predicates there is no place at all for the modal 
aspect of the use of “if... then” in instructions of the type “if this 
happens, do that.” On the other hand, this aspect can easily be expressed 
by the semantics of the connective “if... then... else” in algorithmic 
languages such as Algol. Unless one uses techniques suggested by
algorithmic languages, any attempt to find a model for modality in languages
based on ℒ₁ is doomed to failure (compare: A. A. Ivin, The Logic of
Norms, MGU Press, 1973).


4. We have mentioned several times that the choice of the primitive modes 
of expression in the logic of predicates does not reflect psychological 
reality. Elementary logical operations, even one-step deductions, may 
require a highly trained intellect; yet, logically complicated operations can 
often be performed as a single elementary act of thought even by a 
damaged brain. 


“Sublieutenant Zasetsky, aged twenty-three, suffered a head injury 2 
March 1943 that penetrated the left parieto-occipital area of the cranium. 
The injury...was further complicated by inflammation that resulted in 
adhesions of the brain to the meninges and marked changes in the 
adjacent tissues.” 


Professor A. R. Luria met Zasetsky at the end of May 1943, and
observed his condition for the next 26 years. In this time Zasetsky wrote
nearly 3000 pages, describing with agonizing effort his life and illness as he
struggled to regain his reason. His notebooks, which provided the material
for Luria’s book The Man with a Shattered World (Basic Books, Inc., New
York, 1972, translated by L. Solotaroff), not only show his perseverance
and determination, but are also revealing from a psychological point of
view.

At first, the destruction of Zasetsky’s psyche was overwhelming. The 
predominant disorder was asematia, the inability to connect symbols with 
their meaning. Luria describes his first meeting with Zasetsky: 


“‘Try reading this page,’ I suggested to him.

‘What’s this?...No, I don’t know...don’t understand...what is 
this?’.... 

I suggested he try to do something simple with numbers, like add six 
and seven. 

‘Seven...six...what’s it? No, I can’t...just don’t know.’” 

The ability to understand the simplest predicates was lost: “‘What 
season is there before winter?’ ‘Before winter? After winter?...Sum- 
mer?...Or something! No, I can’t get it.’ ‘Before spring?’ ‘It’s spring 
now...and...and before...I’ve already forgotten, just can’t remember.’” 

Zasetsky lost the ability to interpret the syntactic devices for organiz- 
ing meaning: “‘In the school where Dunya studied a woman worker from 
the factory came to give a report.’ What did this mean to him? Who gave 
the report—Dunya or the factory worker? And where was Dunya study- 
ing? Who came from the factory? Where did she speak?” 


This is a fairly difficult example composed by Professor Luria, but here 
is what Zasetsky himself writes: 


“I also had trouble with expressions like: ‘Is an elephant bigger than a
fly?’ and ‘Is a fly bigger than an elephant?’ All I could figure out was that 
a fly is small and an elephant is big, but I didn’t understand the words 
bigger and smaller. The main problem was I couldn’t understand which 
word they referred to.” 


What attracts our attention is the complexity of Zasetsky’s metalinguis- 
tic text describing his linguistic difficulties. The subtlety of the analysis 
seems incompatible with the crude errors being analyzed. This could be 
explained by the retrospective nature of the analysis, but the following 
even more complicated description was written concurrently with the 
experience of the mental defect being described: 


“Sometimes I’ll try to make sense out of those simple questions about
the elephant and the fly, decide which is right or wrong. I know that when
you rearrange the words, the meaning changes. At first I didn’t think it
did, it didn’t seem to make any difference whether or not you rearranged
the words. But after I thought about it a while I noticed that the sense of
the four words (elephant, fly, smaller, larger) did change when the words
were in a different order. But my brain, my memory, can’t figure out right
away what the word smaller (or larger) refers to. So I always have to think
about them for a while.... So sometimes ridiculous expressions like ‘a fly
is bigger than an elephant’ seem right to me, and I have to think about it
a while longer.”


We can also see how complicated mental abilities were preserved while 
“simple” ones were lost from examples of Zasetsky’s creative imagination, 
which resemble literary-psychological studies: 


“Say I’m a doctor examining a patient who is seriously ill. I’m terribly 
worried about him, grieve for him with all my heart. (After all, he’s 
human too, and helpless. I might become ill and also need help. But right 
now it’s him I’m worried about—I’m the sort of person who can’t help 
caring.) But say I’m another kind of doctor—someone who is bored to 
death with patients and their complaints. I don’t know why I took up 
medicine in the first place, because I don’t really want to work and help 
anyone. I'll do it if there’s something in it for me, but what do I care if a 
patient dies? It’s not the first time people have died, and it won’t be the 
last.” 


All of this shows that there is no basis whatsoever for Rosser’s opinion 
that “once the proof is discovered, and stated in symbolic logic, it can be 
checked by a moron.” The human mind is not at all well suited for 
analyzing formal texts. 


4 Deducibility 


4.1. Definition. A deduction of a formula P from a set of formulas ℰ (in a
language L in ℒ₁) is a finite sequence of formulas P₁, ..., P_n = P with
the property that for each i = 1, ..., n at least one of the following
alternatives holds:


(a) P_i ∈ ℰ;

(b) ∃j < i such that P_i is a direct consequence of P_j using Gen;

(c) ∃j, k < i such that P_i is a direct consequence of P_j and P_k using
MP.


We shall write ℰ ⊢ P to abbreviate “there exists a deduction of P from
ℰ.” A deduction of P, together with a precise indication for each i ≤ n of
which of the alternatives (a), (b), (c) and which indices j in case (b) or j, k
in case (c) are used to obtain P_i, is called a description of a deduction. A
single deduction may have several descriptions.
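
A description of a deduction is exactly the data needed to check a deduction
mechanically. The following Python sketch illustrates this under several
assumptions that are not part of the text: formulas are nested tuples,
membership in ℰ is supplied as a predicate in_E, indices are 0-based, and the
names check_description, imp, and forall are hypothetical.

```python
def imp(a, b):
    return ("imp", a, b)

def forall(x, a):
    return ("forall", x, a)

def check_description(steps, in_E):
    """Check a description of a deduction.

    steps is a list of pairs (formula, reason), where reason is one of
      ("E",)        alternative (a): the formula lies in the set E
      ("Gen", j)    alternative (b): direct consequence of step j using Gen
      ("MP", j, k)  alternative (c): direct consequence of step j and of
                    step k, which must be (step j) => formula, using MP
    Indices are 0-based and must point to earlier steps.
    """
    for i, (f, reason) in enumerate(steps):
        kind = reason[0]
        if kind == "E":
            ok = in_E(f)
        elif kind == "Gen":
            j = reason[1]
            ok = (0 <= j < i and isinstance(f, tuple)
                  and f[0] == "forall" and f[2] == steps[j][0])
        elif kind == "MP":
            j, k = reason[1], reason[2]
            ok = (0 <= j < i and 0 <= k < i
                  and steps[k][0] == imp(steps[j][0], f))
        else:
            ok = False
        if not ok:
            return False
    return True

# A toy set E and a four-step deduction of "for all x, Q" from it.
E = {"P", imp("P", "Q")}
steps = [
    ("P", ("E",)),
    (imp("P", "Q"), ("E",)),
    ("Q", ("MP", 0, 1)),
    (forall("x", "Q"), ("Gen", 2)),
]
```

Here check_description(steps, E.__contains__) returns True. In realistic use
ℰ is infinite (it contains Ax L), which is why membership is passed as a
predicate rather than as a finite set.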

We usually consider deductions from sets ℰ which contain Ax L, the
logical axioms of L. The other elements of ℰ may be formulas of L which
are “guessed” to be true in the standard interpretation; these are called
special axioms of L. (Examples will be given later in 4.6-4.9.) Such
deductions may be considered the formal equivalents of mathematical
proofs (of a formula P = P_n from the hypotheses ℰ). This identification is
justified for the following reasons:

(a) As shown in 3.3, if ℰ ⊂ T_φL for some interpretation φ, and if ℰ ⊢ P,
then P ∈ T_φL; only true formulas can be deduced from true formulas.

(b) A large amount of experimental work has been done on formalizing
mathematical proofs, that is, replacing them by deductions in suitable
languages of ℒ₁, especially L₁Set. This work has shown that for large
segments of mathematics, including the foundations of the theory of
integers and real numbers, set theory, and so on, proofs can successfully be
formalized as deductions within the framework of ℒ₁. There is much
material on this theme in the literature on mathematical logic; see, in
particular, Mendelson’s book.

(c) Gödel’s completeness theorem for the logical modes of expression in
ℒ₁ (see §6) shows that any formula which is not deducible from ℰ must be
false in some model (interpretation) of ℰ.

For further discussion, see “Digression: Proof.” 

We occasionally consider deductions from another type of sets ℰ. For
example, we might remove from ℰ certain logical axioms, such as the “law
of the excluded middle” (B1 in Section 3.4), in order to investigate
formally intuitionistic principles. Or we might add to ℰ a formula which
we think is false in order to deduce a contradiction from ℰ; this is the
so-called “proof by contradiction.”

We now prove some formal aspects of contradiction. 


4.2. Proposition. Suppose that ℰ contains all tautologies of type B2 in
Subsection 3.4. Then the following two properties of ℰ are equivalent:


(a) There exists a formula P such that ℰ ⊢ P and ℰ ⊢ ¬P.
(b) ℰ ⊢ Q for any formula Q.


A set ℰ with these properties is called inconsistent.


PROOF. (b) ⇒ (a) is obvious. Conversely, suppose ℰ ⊢ P and ℰ ⊢ ¬P. We
first add the formula ¬P ⇒ (P ⇒ Q), which is assumed to lie in ℰ, to the
descriptions of the two deductions. Then, applying MP twice (to this
formula and ¬P; then to P ⇒ Q and P), we obtain a description of a
deduction ℰ ⊢ Q. □
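
The construction in this proof is explicit enough to write down. The sketch
below is a hedged illustration in Python: formulas are assumed to be encoded
as nested tuples, a deduction is a list whose last entry is the formula
deduced, and the names deduce_anything, imp, and neg are hypothetical.

```python
def imp(a, b):
    return ("imp", a, b)

def neg(a):
    return ("not", a)

def deduce_anything(ded_P, ded_not_P, Q):
    """Given deductions of P and of ¬P from E, return a deduction of Q from E.

    The only formula added from E is the tautology ¬P => (P => Q) of type B2,
    which is assumed to lie in E; the last two formulas are obtained by MP.
    """
    P = ded_P[-1]
    if ded_not_P[-1] != neg(P):
        raise ValueError("the second deduction must end in the negation of P")
    b2 = imp(neg(P), imp(P, Q))
    return ded_P + ded_not_P + [b2, imp(P, Q), Q]
```

For one-step deductions of P and ¬P, the result has five formulas and ends in
Q, whatever Q is.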


4.3. A large part of the theorems of logic consists in proving assertions of
the type “ℰ ⊢ P” or “it is not true that ℰ ⊢ P” for various languages L, sets
ℰ, and (classes of) formulas P.

A result of the form ℰ ⊢ P may be proved by presenting a description of
a deduction of P from ℰ. However, even in slightly complicated cases, this
procedure becomes so long that it is replaced by more or less complete
instructions on how to compose such a description. Finally, “ℰ ⊢ P” may
be proved without presenting even an incomplete description of a
deduction of P from ℰ. In this case we “are not proving P, but are proving
that a proof of P exists;” see the example in §8 concerning language
extensions.
In rare cases a result of the form “it is not true that ℰ ⊢ P” can be
proved by a purely syntactic argument. But usually such a result is
obtained by constructing a model, i.e., an interpretation, in which ℰ is true
and P is false; see the discussion of the continuum problem in Chapters
III-IV. If it is true neither that ℰ ⊢ P nor that ℰ ⊢ ¬P, we say that P is
independent of ℰ.

We now give two useful elementary results concerning deductions. It is 
clear that, compared with usual proofs, deductions are made up of very 
minor details. The mathematician, as if wearing seven-league boots, covers 
entire fields of formal deductions in one step. 


4.4. Lemma. Suppose that ℰ contains all tautologies. If ℰ ⊢ P and ℰ ⊢ Q,
then ℰ ⊢ P ∧ Q.

PROOF. If P₁, ..., Pₘ and Q₁, ..., Qₙ are deductions of P and Q, respectively, then


P₁, ..., Pₘ, Q₁, ..., Qₙ, P ⇒ (Q ⇒ (P ∧ Q)), Q ⇒ (P ∧ Q), P ∧ Q


is a deduction of P ∧ Q. The third formula from the end is a tautology; the
second formula from the end is a direct consequence of this tautology and
Pₘ = P using MP; and the last formula is a direct consequence of the
second to last and Qₙ = Q using MP. □
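
The deduction in this proof can be checked mechanically. The sketch below is a minimal propositional checker, written under simplifying assumptions that are not in the text: formulas are nested tuples, P and Q are treated as atomic hypotheses standing in for the two given deductions, and a line is accepted if it is a hypothesis, a tautology, or follows from two earlier lines by MP.

```python
from itertools import product

# Formulas are nested tuples: ("var", name), ("not", A), ("imp", A, B), ("and", A, B).
def var(name): return ("var", name)
def imp(a, b): return ("imp", a, b)
def conj(a, b): return ("and", a, b)

def atoms(f):
    """Names of the atomic formulas occurring in f."""
    if f[0] == "var":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))

def value(f, v):
    """Truth value of f under the assignment v (a dict from names to bool)."""
    op = f[0]
    if op == "var": return v[f[1]]
    if op == "not": return not value(f[1], v)
    if op == "imp": return (not value(f[1], v)) or value(f[2], v)
    if op == "and": return value(f[1], v) and value(f[2], v)
    raise ValueError(op)

def is_tautology(f):
    names = sorted(atoms(f))
    return all(value(f, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))

def is_deduction(lines, hypotheses):
    """Each line must be a hypothesis, a tautology, or follow from earlier lines by MP."""
    for k, f in enumerate(lines):
        earlier = lines[:k]
        by_mp = any(("imp", a, f) in earlier for a in earlier)
        if not (f in hypotheses or is_tautology(f) or by_mp):
            return False
    return True

P, Q = var("P"), var("Q")
lemma_4_4 = [P, Q, imp(P, imp(Q, conj(P, Q))), imp(Q, conj(P, Q)), conj(P, Q)]
print(is_deduction(lemma_4_4, hypotheses={P, Q}))
```

Dropping the three auxiliary formulas and writing P ∧ Q directly after P and Q is rejected by the checker, which illustrates how small the individual steps of a formal deduction are.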


4.5. Deduction Lemma. Suppose that ℰ ⊃ Ax L and P is a closed formula. If
ℰ ∪ {P} ⊢ Q, then ℰ ⊢ P ⇒ Q.


PROOF. Let Q₁, ..., Qₙ = Q be a deduction of Q from ℰ ∪ {P}. We show
by induction on n that there exists a deduction of P ⇒ Q from ℰ.

(a) n = 1. Then either Q ∈ ℰ, or else Q = P. In the first case P ⇒ Q is
deduced from Q and the tautology Q ⇒ (P ⇒ Q) using MP. In the second
case P ⇒ P is a tautology.

(b) n ≥ 2. We assume that the lemma holds for deductions of length
≤ n − 1. Then ℰ ⊢ P ⇒ Qᵢ for all i ≤ n − 1. Further, we have the following
possibilities for Qₙ = Q: (b₁) Q ∈ ℰ; (b₂) Q = P; (b₃) Q is deduced from
Qᵢ and Qⱼ = (Qᵢ ⇒ Q) using MP; and (b₄) Q has the form ∀x Qⱼ for
j ≤ n − 1. The first two cases are handled in exactly the same way as for
n = 1.

In case (b₃), P ⇒ Q can be deduced from ℰ in the following way:


(1) deduction of P ⇒ Qᵢ (induction assumption)
(2) deduction of P ⇒ (Qᵢ ⇒ Q) (induction assumption)
(3) (P ⇒ (Qᵢ ⇒ Q)) ⇒ ((P ⇒ Qᵢ) ⇒ (P ⇒ Q)) (tautology)
(4) (P ⇒ Qᵢ) ⇒ (P ⇒ Q) (from (2) and (3) using MP)
(5) P ⇒ Q (from (1) and (4) using MP).


From now on, arguments of this sort will be presented more briefly, with
explicit mention of only the last steps of the induction (here (3), (4), and (5)).

Finally, in case (b₄), we obtain a deduction of P ⇒ ∀x Qⱼ from ℰ if we
add the following formulas to the deduction of P ⇒ Qⱼ from ℰ (which
exists by the induction assumption):


∀x(P ⇒ Qⱼ) (Gen)
∀x(P ⇒ Qⱼ) ⇒ (P ⇒ ∀x Qⱼ) (logical quantifier axiom, since P is closed)
P ⇒ ∀x Qⱼ (MP applied to the two preceding formulas).


The lemma is proved. □
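
The induction in this proof is in effect an algorithm that rewrites a deduction of Q from ℰ ∪ {P} into a deduction of P ⇒ Q from ℰ. The Python sketch below carries out this rewriting for the propositional cases only (hypotheses, tautologies, and MP); the rule Gen of case (b₄) is deliberately left out, and the tuple encoding of formulas and all function names are assumptions made for the illustration.

```python
from itertools import product

# Formulas: ("var", name) or ("imp", A, B). Only MP is handled, not Gen.
def imp(a, b): return ("imp", a, b)

def atoms(f):
    return {f[1]} if f[0] == "var" else atoms(f[1]) | atoms(f[2])

def value(f, v):
    return v[f[1]] if f[0] == "var" else (not value(f[1], v)) or value(f[2], v)

def is_tautology(f):
    names = sorted(atoms(f))
    return all(value(f, dict(zip(names, bits)))
               for bits in product([False, True], repeat=len(names)))

def is_deduction(lines, hyps):
    """Every line is a hypothesis, a tautology, or follows from earlier lines by MP."""
    return all(f in hyps or is_tautology(f)
               or any(imp(a, f) in lines[:k] for a in lines[:k])
               for k, f in enumerate(lines))

def deduction_lemma(lines, hyps, p):
    """Turn a deduction of Q from hyps + {p} into a deduction of p => Q from hyps."""
    out = []
    for k, q in enumerate(lines):
        if q == p:
            out.append(imp(p, p))                      # case Q = P: P => P is a tautology
        elif q in hyps or is_tautology(q):
            out += [q, imp(q, imp(p, q)), imp(p, q)]   # case Q in E
        else:                                          # case MP from Q_i and Q_i => Q
            qi = next(a for a in lines[:k] if imp(a, q) in lines[:k])
            step3 = imp(imp(p, imp(qi, q)), imp(imp(p, qi), imp(p, q)))
            out += [step3, imp(imp(p, qi), imp(p, q)), imp(p, q)]
    return out

P, A, B = ("var", "P"), ("var", "A"), ("var", "B")
hyps = {imp(P, A), imp(A, B)}
original = [P, imp(P, A), A, imp(A, B), B]       # deduction of B from hyps + {P}
rewritten = deduction_lemma(original, hyps, P)   # deduction of P => B from hyps
print(rewritten[-1] == imp(P, B), is_deduction(rewritten, hyps))
```

The rewritten deduction is roughly three times as long as the original one, which is one reason such arguments are normally presented only in outline.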


We record for future reference that, in the parts of deductions constructed
in Lemmas 4.4 and 4.5, only tautologies of the type A0, A1, and
A2 in Subsection 3.4 were used.

We now give some basic examples of special axioms. 


Axioms of equality 

Let L be a language in ℒ₁ whose alphabet includes a relation = of rank
two. We shall write t₁ = t₂ instead of =(t₁, t₂). If P is a formula, x is a
variable, and t is a term, we let P(x, t) denote the result of substituting t in
P in place of any or all of the free occurrences of x in P for which t is free.


4.6. Proposition.
(a) The formulas


t = t; t₁ = t₂ ⇔ t₂ = t₁; t₁ = t₂ ∧ t₂ = t₃ ⇒ t₁ = t₃;
x = t ⇒ (P(x, x) ⇒ P(x, t))


are φ-true for any interpretation φ of L in which φ(=) is equality.
(b) All the formulas in (a) are deducible from the set


Ax L ∪ {x = x | x is a variable}
∪ {x = y ⇒ (P(x, x) ⇒ P(x, y)) | P is an atomic formula}.


The formulas in this list, except for Ax L, are called the axioms of
equality.

(c) Let φ be any interpretation of L in a set M for which the axioms of
equality are true. Then φ(=) is an equivalence relation in M which is
compatible with the interpretations of all the relations and operations of L
in M. If φ′ denotes the obvious interpretation of L in the quotient set
M′ = M/φ(=), then φ′(=) is equality, and T_φ L = T_φ′ L.


PROOF (SKETCH)

(a) The φ-truth is easily established. We illustrate this by showing that
the last formula is φ-true. Suppose it were false at a point ξ ∈ M̄. Then
|x = t|(ξ) = 1, |P|(ξ) = 1 and |P(x, t)|(ξ) = 0. The first assertion means that
x^ξ = t^ξ. But then |P|(ξ) = |P(x, t)|(ξ) by Proposition 2.10, contradicting the
second and third assertions.

(b) Deduction of t = t: x = x (axiom of equality); ∀x(x = x) (Gen);
∀x(x = x) ⇒ t = t (logical axiom of specialization); t = t (MP).
Deduction of t₁ = t₂ ⇔ t₂ = t₁:


(1) x = y ⇒ (x = x ⇒ y = x) (axiom of equality with = for P)

(2) Q ⇒ ((P ⇒ (Q ⇒ R)) ⇒ (P ⇒ R)), where P is x = y, Q is x = x, R is
y = x (tautology)

(3) x = x (axiom of equality)

(4) (P ⇒ (Q ⇒ R)) ⇒ (P ⇒ R) (MP is applied to (2) and (3))

(5) x = y ⇒ y = x (MP applied to (1) and (4)).


We then twice apply Gen, the axiom of specialization, and MP, in order to
deduce the formula t₁ = t₂ ⇒ t₂ = t₁ from (5); we replace t₁ by t₂ and t₂ by
t₁ to deduce t₂ = t₁ ⇒ t₁ = t₂; we use Lemma 4.4 to deduce the conjunction
of these two formulas; and, finally, the tautology
(t₁ = t₂ ⇒ t₂ = t₁) ∧ (t₂ = t₁ ⇒ t₁ = t₂) ⇒ (t₁ = t₂ ⇔ t₂ = t₁),
together with MP, gives the required formula.

The deduction of the third and fourth formulas in (a) will be left to the 
reader. The existence of a deduction of the fourth formula can be proved 
by induction on the number of connectives and quantifiers in P. P is
represented in the form ¬Q, Q₁ * Q₂, ∀x Q, or ∃x Q; we assume that the
formula with Q, Q₁, and Q₂ in place of P has already been deduced, and
we complete the deduction for P (see Mendelson, Chapter 2, Proposition 
2.25). 

(c) If the axioms of equality are φ-true, then so are the formulas in (a),
since they are deducible. The first three formulas in (a), applied to three
different variables x, y, and z, then show that the relation φ(=) on M is
reflexive, symmetric, and transitive. In fact, let X, Y, and Z be any three
elements of M, let ξ ∈ M̄ be a point such that x^ξ = X, y^ξ = Y, and z^ξ = Z,
and let ~ be the relation φ(=) on M. The φ-truth of the formulas in (a)
means that

X ~ X; X ~ Y ⇔ Y ~ X; X ~ Y and Y ~ Z ⇒ X ~ Z.

By definition, to say that ~ is compatible with the φ-interpretation of
all relations and operations on M means the following. Let p be a relation,
and let φ(p) ⊂ M^r be its interpretation. If ⟨X₁, ..., Xᵣ⟩ ∈ φ(p) and Xᵢ′ ~ Xᵢ,
then ⟨X₁, ..., Xᵢ′, ..., Xᵣ⟩ ∈ φ(p). Now let f be an operation, and let
φ(f) : M^r → M be its interpretation. If φ(f)(X₁, ..., Xᵣ) = Y and Xᵢ′ ~ Xᵢ,
then φ(f)(X₁, ..., Xᵢ′, ..., Xᵣ) = Y′ ~ Y.

We verify this compatibility by using the φ-truth of the last formula in
4.6(a) at a suitable point ξ ∈ M̄. Here we take the formulas p(x₁, ..., xᵣ)
and f(x₁, ..., xᵣ) = y, respectively, for P; we take the variable xᵢ′ for t and
the variable xᵢ for x; and we set xᵢ^ξ = Xᵢ, xᵢ′^ξ = Xᵢ′, and y^ξ = Y.

It follows from the compatibility that we can construct an interpretation
φ′ of L in M′ = M/~, such that φ′(p) = φ(p) mod ~, φ′(f) = φ(f) mod ~,
and φ′(=) is equality. The last formula in 4.6(a) will then imply that all
the φ-true formulas remain φ′-true, and conversely. □
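
The passage from M to the quotient M′ = M/~ can be illustrated on a finite toy example. In the Python sketch below every concrete choice is an assumption made for the illustration and is not taken from the text: M = {0, ..., 5}, the symbol = is interpreted as congruence modulo 3, f is addition modulo 6, and p is the unary relation "divisible by 3".

```python
M = range(6)

def equiv(a, b):
    """Plays the role of the interpretation of '=': congruence modulo 3."""
    return (a - b) % 3 == 0

def f(a, b):
    """An operation on M: addition modulo 6."""
    return (a + b) % 6

def p(a):
    """A unary relation on M."""
    return a % 3 == 0

# Compatibility: replacing an argument by an equivalent one changes neither
# the class of the value of f nor the truth of p.
compatible = all(equiv(f(a, b), f(a2, b)) and p(a) == p(a2)
                 for a in M for a2 in M for b in M if equiv(a, a2))

# Quotient set M' = M/~, with each class represented as a frozenset.
classes = {frozenset(b for b in M if equiv(a, b)) for a in M}

def f_quot(A, B):
    """The induced operation on classes, computed on representatives."""
    return next(C for C in classes if f(min(A), min(B)) in C)

print(compatible, len(classes))
print(sorted(f_quot(frozenset({1, 4}), frozenset({2, 5}))))
```

Compatibility is what makes `f_quot` independent of the chosen representatives; in the quotient, two classes are related by the induced interpretation of = exactly when they coincide.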

From now on, when we speak of the special axioms for any language in
ℒ₁ having the symbol =, we shall without explicit mention always include
among them the axioms of equality for =. Models in which = is
interpreted as equality are called normal models.


Special axioms of arithmetic 


4.7. Proposition. The following formulas are true in the standard interpretation
of L₁Ar, and are called the special axioms of L₁Ar:


(a) The axioms of equality.
(b) The axioms of addition:


x + 0 = x; x + y = y + x; (x + y) + z = x + (y + z);
x + z = y + z ⇒ x = y.


(c) The axioms of multiplication:
x · 0 = 0; x · 1 = x; x · y = y · x; (x · y) · z = x · (y · z).
(d) The distributive axiom:
x · (y + z) = x · y + x · z.
(e) The axioms of induction:
P(0) ∧ ∀x(P(x) ⇒ P(x + 1)) ⇒ ∀x P(x),
where P is any formula in L₁Ar having one free variable.


The proof is trivial and will be left to the reader. We only note that the 
“proof” that the induction axioms are true itself uses induction. 


Remarks 

(a) In (b), (c), and (d) above, we have written the usual axioms for a
commutative (semi) ring in order to shorten the formal deductions; any
informal computation which only uses these axioms can easily be transformed
into a formal deduction of the result of the computation in L₁Ar.
In Chapter 3 of his textbook, Mendelson gives an apparently weaker set
of axioms, and then shows how to deduce our formulas from them. This
takes up 5-6 pages of text, and is basically a tribute to a historical tradition
going back to Peano.

(b) The induction axioms are a countable set of formulas in L₁Ar; it is
customary to say that 4.7(e) is an axiom schema. The corresponding fact in
intuitive mathematics is stated as follows: “For any property P of nonnegative
integers, if 0 has the property P, and, whenever x has the property
P, x + 1 also has the property P, then all nonnegative integers have the
property P.” Here “property of nonnegative integers” means the same as
“any subset of the nonnegative integers.”

However, in the means of expression of L₁Ar there is no way to say
“any subset.” Neither is there any way to say “all properties;” we can only
list one-by-one the properties that are definable by formulas in the language.
We recall that there are only countably many such properties, while
the intuitive interpretation refers to a continuum of properties. Thus, the
formal axiom of induction is weaker than the informal one, and is also
weaker than the version of this axiom that is obtained by imbedding L₁Ar
in L₁Set.


Special axioms of Zermelo—Fraenkel set theory 
(see the description of V in the Appendix to Chapter I) 


4.8. Proposition. The following formulas are true in the standard interpretation
of L₁Set in the von Neumann universe V:


(a) Axiom of the empty set: ∀x ¬(x ∈ ∅).

(b) Axiom of extensionality: ∀z(z ∈ x ⇔ z ∈ y) ⇔ x = y.

(c) Axiom of pairing: ∀u ∀w ∃x ∀z(z ∈ x ⇔ z = u ∨ z = w).

(d) Axiom of the union: ∀x ∃y ∀u(∃z(u ∈ z ∧ z ∈ x) ⇔ u ∈ y).

(e) Axiom of the power set: ∀x ∃y ∀z(z ⊂ x ⇔ z ∈ y), where z ⊂ x is
abbreviated notation for the formula ∀u(u ∈ z ⇒ u ∈ x).

(f) Axiom of regularity: ∀x(¬ x = ∅ ⇒ ∃y(y ∈ x ∧ y ∩ x = ∅)), where
y ∩ x = ∅ is abbreviated notation for ¬∃z(z ∈ y ∧ z ∈ x).


PROOF AND EXPLANATIONS. This is not a complete list of the axioms of 
Zermelo—Fraenkel; the axiom of infinity, axiom of replacement, and also 
the axiom of choice, which are more subtle, will be discussed in the next 
subsection. 

(a) The truth of these formulas must, of course, be proved by computing
the function | | using the rules in 2.4 and 2.5. We do this, for example, for
the axiom of extensionality. Let ξ be any point in the interpretation class,
and let X = x^ξ, Y = y^ξ. We must show that


|∀z(z ∈ x ⇔ z ∈ y)|(ξ) = |x = y|(ξ),

i.e., that


min over Z ∈ V of (|Z ∈ X| |Z ∈ Y| + (1 − |Z ∈ X|)(1 − |Z ∈ Y|)) = |X = Y|,


where we have written |Z ∈ X| instead of |z ∈ x|(ξ′) with z^ξ′ = Z, x^ξ′ = X,
and so on. But the left-hand side equals 1 if and only if for every Z ∈ V
either both Z ∈ X and Z ∈ Y, or else both Z ∉ X and Z ∉ Y, that is, if
and only if X = Y.

More generally, if we replace V by any subclass M ⊂ V and restrict the
standard interpretation of L₁Set to M, then the same reasoning shows that:

The axiom of extensionality is true in M if and only if for any elements
X, Y ∈ M we have


X = Y ⇔ X ∩ M = Y ∩ M,


i.e., if and only if every element of M is uniquely determined by its elements
which lie in M. This result will be used later.
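
For a finite M the criterion can be tested directly. The following Python sketch models sets as frozensets; the two sample classes M and the function name are chosen only for illustration.

```python
from itertools import combinations

def extensional(M: frozenset) -> bool:
    """Extensionality holds in M iff distinct X, Y in M differ on some element of M."""
    return all(X & M != Y & M for X, Y in combinations(M, 2))

empty = frozenset()
one = frozenset({empty})            # {∅}
two = frozenset({empty, one})       # {∅, {∅}}
single_one = frozenset({one})       # {{∅}}

print(extensional(frozenset({empty, one, two})))     # every element is visible inside M
print(extensional(frozenset({empty, single_one})))   # {{∅}} has no elements inside M
```

In the second class the set {{∅}} has no elements that lie in M, so from the point of view of M it cannot be distinguished from ∅, and extensionality fails.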

The analogous computations for all the other axioms will be given
systematically in a much more difficult context in Chapter III. Hence, at
this point we shall only explain how to translate them into argot, as in
Chapter I, and why they are fulfilled in V.

(b) The axiom of the empty set does not need special comment. We only
remark that, if we interpret L₁Set in a subclass M ⊂ V, then the constant
∅ may be interpreted as any element X ∈ M with the property that
X ∩ M = ∅, and this axiom will still hold.

(c) The axiom of pairing is true, because, if U, W ∈ V_α, then {U, W} ∈
𝒫(V_α), so that all pairs lie in V.

(d) The axiom of the union is true, because, if X ∈ V, then the set
Y = ⋃_{Z ∈ X} Z also lies in V. In fact, if X ∈ V_{α+1} = 𝒫(V_α), then the
elements of X are subsets of V_α, and their union therefore lies in V_{α+1}.

(e) The axiom of the power set is true, because if X ∈ V, then 𝒫(X) ∈
V. In fact, if X ∈ V_α, then X ⊂ V_α, and hence 𝒫(X) ⊂ 𝒫(V_α) = V_{α+1}, so
that 𝒫(X) ∈ V_{α+2}.

(f) The axiom of regularity is true, because any non-empty set X ∈ V
has an empty intersection with at least one of its elements; in this form the
axiom is proved in the Appendix to this chapter.


4.9. The axioms of L₁Set in Subsection 4.8 have one property in common:
their simplest model in the standard interpretation is precisely the union
V_{ω₀} = ⋃_{n ≥ 0} V_n of the first ω₀ levels of the von Neumann universe. In other
words, this is the set of hereditarily finite sets X ∈ V, i.e., those such that,
if Xₙ ∈ Xₙ₋₁ ∈ ··· ∈ X₀ = X, then all the Xᵢ are finite.
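
The first finite levels are small enough to be listed by machine. The Python sketch below assumes the usual definition V₀ = ∅, V_{n+1} = 𝒫(V_n), models sets as frozensets, and checks the pairing and union arguments of 4.8 between levels 3 and 4; the function names are illustrative.

```python
from itertools import chain, combinations

def powerset(s: frozenset) -> frozenset:
    """All subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))
    return frozenset(frozenset(c) for c in subsets)

def levels(n: int) -> list:
    """Return [V_0, ..., V_n] with V_0 empty and V_{k+1} = P(V_k)."""
    V = [frozenset()]
    for _ in range(n):
        V.append(powerset(V[-1]))
    return V

V = levels(4)
print([len(level) for level in V])          # sizes of V_0, ..., V_4

# Pairing: U, W in V_3 gives {U, W} in V_4.
print(all(frozenset({U, W}) in V[4] for U in V[3] for W in V[3]))
# Union: X in V_4 gives the union of the elements of X in V_4.
print(all(frozenset().union(*X) in V[4] for X in V[4]))
```

The sizes grow as iterated powers of 2, so already V₅ has 65536 elements and V₆ cannot be listed in practice, although every level remains a finite set.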

V_{ω₀} is the reliable, familiar world of combinatorics and number theory.
Additional principles are needed to force us out of this world. There are 
two such principles: the axiom of infinity and the axiom schema of 
replacement. 

(a) Axiom of infinity:


∃x(∅ ∈ x ∧ ∀y(y ∈ x ⇒ {y} ∈ x)).


Here {y} ∈ x is abbreviated notation for ∃z(z = {y, y} ∧ z ∈ x), where
the meaning of z = {y, y} was explained in 3.7 of Chapter I. This axiom
requires that we add to V_{ω₀} some set X containing the elements
∅, {∅}, {{∅}}, ... (a countable sequence). Then, in order to preserve the
intuitive version of the axiom of the power set, we must add
𝒫(X), 𝒫²(X), ..., thereby hopelessly leaving the realm of finite sets,
countable sets, continua, and so on.

It is a striking fact that none of this is necessary in the formal, as 
opposed to intuitive, version of set theory, where we can always limit 
ourselves to hereditarily countable submodels of V. This important fact 
will be discussed in detail in §7. 

(b) Axiom schema of replacement. We introduce the following convenient
abbreviated notation (in any language of ℒ₁ having the notion of
equality): ∃!y P(y) means ∃y P(y) ∧ ∀x ∀y(P(x) ∧ P(y) ⇒ x = y).
Thus, this formula is read: “There exists a unique object y with the
property P,” where we assume that = is interpreted as equality. When
other variables besides y occur freely in P, the formula ∃!y P(y) is true
precisely when P determines y as an “implicit function” of the other
variables.

We can now write the replacement axioms. In the formula P below we
list all the variables which occur freely in P:


∀z₁ ··· ∀zₙ ∀u(∀x(x ∈ u ⇒ ∃!y P(x, y, z₁, ..., zₙ))
⇒ ∃w ∀y(y ∈ w ⇔ ∃x(x ∈ u ∧ P(x, y, z₁, ..., zₙ)))).


The hypothesis says that “P gives y as a function of x ∈ u (for given values
of the parameters z₁, ..., zₙ)”; the conclusion says that “the image of the
set u under this function is some set w.”

From the standpoint of the formal theory it is worthwhile to note that 
from this axiom and the axioms of equality are deducible the so-called 
separation axioms, namely: 


∀z₁ ··· ∀zₙ ∀x ∃y ∀u(u ∈ y ⇔ u ∈ x ∧ P(u, z₁, ..., zₙ)).


This says that if we take the class of sets having a property P and intersect 
it with a set x, we obtain a set. 

The replacement axioms should be looked at very carefully. They go
beyond the usual, “intuitively obvious” working tools of the topologist and
analyst. The axioms assert that, for example, it is impossible to “stretch”
an ordinal α too far by means of a function f; for any f we choose, there is
always an ordinal β such that all the values f(γ), γ < α, lie in V_β. In other
words, the universe V is incomparably more infinite than any of its levels
V_α.

Even if we adopt this axiom, questions remain which are very similar in
style, which are beyond the reach of our intuition, and which are not
solvable using this and the other axioms. For example, do there exist
so-called inaccessible cardinals γ? One of the properties of an inaccessible
cardinal γ is the following: if f is a function from V_α to V_γ (with α < γ),
then the set of values of f is an element of V_γ. In particular, there is an
“upper bound” beyond which ordinals not exceeding γ cannot be
“stretched.” Do such infinities exist or not?

After thinking about this and related problems, many specialists on the
foundations of mathematics have come to the conclusion that such languages
of set theory as L₁Set with a suitable axiom system are the only
reality one should work with, and any attempt to make intrinsic sense out
of the universe V or similar models is in principle doomed to failure. In
particular, the set of formulas in L₁Set which are true in the standard
interpretation is not defined, and we can only talk about formulas which
are deducible from the axioms.

But we shall not entirely adopt this point of view for several reasons.
The simplest reason is the feeling that a language without an interpretation 
not only loses its intrinsic justification, but also cannot be used for 
anything. We cannot even play the “formal game” well unless we master 
the intuitive concepts which give meaning to the symbols. A language 
(along with the external world) helps bring order and precision to these 
intuitive concepts, which, in turn, make us change the language or at least 
revise our earlier linguistic constructions. But we can never assume that we 
have achieved complete clarity. 

We should understand the need for certain types of self-restraint. 
However, intellectual asceticism (like all other forms of asceticism) cannot 
be the lot of many. 

(c) Axiom of choice:


∀x(¬ x = ∅ ⇒ ∃y(“y is a function with domain of definition x”
∧ ∀u(u ∈ x ∧ ¬ u = ∅ ⇒ ∃w(w ∈ u ∧ “⟨u, w⟩ ∈ y”)))).


That is, y chooses one element from each nonempty element u € x. 

The belief that this axiom is true in V is at least as justified as the belief 
in the existence of V itself. Over the past fifty years it has become 
customary for every working mathematician to accept this axiom, and the 
heated controversies about it at the beginning of the century are now all 
but forgotten. The interested reader is referred to Chapter II of Founda- 
tions of Set Theory by Fraenkel and Bar-Hillel (North-Holland, Amster- 
dam, 1958). 


4.10. General properties of axioms. Despite the wide variety of concepts
reflected in these axioms, each of our sets of axioms for languages in ℒ₁
(tautologies; Ax L; special axioms of L₁Ar and L₁Set) has the following
informal syntactic characteristics:


(a) An algorithm can be given which tells whether any given expression is 
an axiom (compare: the syntactic analysis in §1 and the verification of 
the tautologies in Subsection 3.4). 

(b) A finite number of rules can be given for generating the axioms. 


It is clear that, a priori, property (b) is less restrictive than (a). In fact, 
an algorithm as in (a) can be transformed into a rule for generating the 
axioms: “Write out all possible expressions one by one in some order, and 
take those for which the algorithm gives a positive answer.” 
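
The rule quoted above is easy to write down as a program. In the Python sketch below, the alphabet and the decision procedure `is_reflexivity` are toy stand-ins invented for the illustration; only the shape of the construction (enumerate all expressions, keep those accepted by the algorithm) comes from the text.

```python
from itertools import count, islice, product

def all_expressions(alphabet: str):
    """Yield every nonempty finite string over the alphabet, shortest first."""
    for n in count(1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def generate_axioms(is_axiom, alphabet: str):
    """Turn a decision procedure (property (a)) into a generating rule (property (b))."""
    return (e for e in all_expressions(alphabet) if is_axiom(e))

def is_reflexivity(e: str) -> bool:
    """Toy decision procedure: accepts exactly the strings 'x=x' and 'y=y'."""
    return len(e) == 3 and e[1] == "=" and e[0] == e[2] and e[0] != "="

print(list(islice(generate_axioms(is_reflexivity, "xy="), 2)))
```

The generator never terminates by itself: asking for a third axiom of this toy system would make it search forever. This is the asymmetry between (a) and (b): a generating rule confirms membership in finite time but gives no way of confirming non-membership.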

It is actually natural to suppose that property (a) should characterize 
axioms, and property (b) should characterize deducible formulas, no 
matter how we explicitly describe the axioms and the deducible formulas 
in a given language. In Part III we make these intuitive ideas into precise 
definitions and show that (b) is strictly weaker than (a). See also the 
discussion in Subsection 11.6(c) of this chapter. 

Digression: proof


1. A proof only becomes a proof after the social act of “accepting it as a 
proof.” This is as true for mathematics as it is for physics, linguistics, or 
biology. The evolution of commonly accepted criteria for an argument’s 
being a proof is an almost untouched theme in the history of science. In 
any case, the ideal for what constitutes a mathematical demonstration of a 
“nonobvious truth” has remained unchanged since the time of Euclid: we 
must arrive at such a truth from “obvious” hypotheses, or assertions which 
have already been proved, by means of a series of explicitly described, 
“obviously valid” elementary deductions. 

Thus, the method of deduction is a method of mathematics par excel- 
lence. (“Mathematical induction” clearly comes out of the same tradition. 
Peano’s induction principle allows us to write only the first step and the 
general step of a proof, and is thereby in some sense the first metamathe- 
matical principle. This point is observed by the tradition of listing Peano’s 
axiom among the special axioms (see 4.7(e)), but, one way or another, it is 
one of the archetypes of mathematical thought.) 

The longer the deductive argument, the more important it is for all its 
elementary components to be written in an explicit and normalized fash- 
ion. In the last analysis, the amount of initial data in formal mathematics 
is so small that failure to observe the rules of hygiene in long deductions 
would lead to the collapse of the system if we did not have external checks 
on the system. In induction, on the other hand, relatively short deductions 
are based on a vast amount of initial information. Darwin’s theory of 
evolution is explained to school children, but life is not long enough to 
judge how persuasive the proofs are. We see a similar situation in com- 
parative linguistics when the features of the so-called protolanguages are 
reconstructed. In such uses of induction, the “rules of deduction” cannot 
be so very rigid, despite the critical viewpoint of the neo-grammarians. 


2. The above observations concerning the method of deduction are supported
by the fact that the notion of a formal deduction in languages of ℒ₁
is a close approximation to the concept of an ideal mathematical proof. It
is therefore enlightening to examine the differences between deductions 
and the arguments we use in day-to-day practice. 


(a) Reliability of the principles. Not only the mathematics implicit in the
special axioms of L₁Set and L₁Ar, but even the logic of the languages of ℒ₁
is not accepted by everyone. In particular, Brouwer and others have called 
into question the law of the excluded middle. From their extremely critical 
perspective, our “proofs” are at best harmless deductions of nonsense out 
of falsehood. 

The mathematician cannot permit himself to be completely deaf to 
these criticisms. After thinking about them for a while, he should at least 
be willing to admit that proofs can have objectively different “degrees of
proofness.” 


(b) Levels of “proofness.” Every proof that is written must be approved and 
accepted by other mathematicians, sometimes by several generations of 
mathematicians. In the meantime, both the result and the proof itself are 
liable to be refined and improved. Usually the proof is more or less an 
outline of a formal deduction in a suitable language. But, as mentioned 
before, an assertion P is sometimes established by proving that a proof of 
P exists. This hierarchy of proofs of the existence of proofs can, in 
principle, be continued indefinitely. We can take down the hierarchy using 
sophisticated logical and set theoretic principles; however, not everyone 
might agree with these principles. Papers on constructive mathematics 
abound with assertions of the type: “there cannot not exist an algorithm 
which computes x,” whereas a classical mathematician would simply say 
“x exists,” or even “x exists and is effectively computable.” 


(c) Errors. The peculiarities of the human mind make it impossible in 
practice to verify formal deductions, even if we agree that, in principle, 
such a verification is the ideal form for a proof. Two circumstances act 
together with perilous effect: formal deductions are much longer than texts 
in argot, and humans are much slower at reading and comprehending such 
formal arguments than texts in natural languages. 

A proof of a single theorem may take up five, fifteen, or even fifty 
pages. In the theory of finite groups, the proofs of the two Burnside 
conjectures occupy nearly five hundred pages apiece. Deligne has esti- 
mated that a complete proof of Ramanujan’s conjecture assuming only set 
theory and elementary analysis would take about two thousand pages. The 
length of the corresponding formal deductions staggers the imagination. 

Hence, the absence of errors in a mathematical paper (assuming that 
none are discovered), as in other natural sciences, is often established 
indirectly: how well the results correspond to what was generally expected, 
the use of similar arguments in other papers, examination of small sections 
of the proof “under the microscope,” even the reputation of the author—in 
short, its reproducibility in the broadest sense of the word. “Incomprehen- 
sible” proofs can play a very useful role, since they stimulate the search for 
more accessible arguments. 

The last two decades have seen the appearance of a very powerful 
method for performing long formal deductions, namely the use of com- 
puters. At first glance, it would seem that the status of formal deductions 
might greatly improve, so that the Leibnizian ideal of being able to verify 
truth mechanically would become attainable. But the state of affairs is 
actually much less trivial. 

We first give two authoritative opinions on this question by C. L. Siegel 
and H. P. F. Swinnerton-Dyer. Both opinions relate to the solution by 
computer of concrete number theoretic problems. 


3. The present level of knowledge concerning Fermat’s last theorem is as
follows. Let p be a prime. It is called regular if it does not divide the
numerator of any of the Bernoulli numbers B₂ = 1/6, B₄ = −1/30, ..., B_{p−3}.
Fermat’s theorem was proved for regular prime exponents by Kummer. 
For irregular p there is a series of criteria for Fermat’s theorem to hold. 
These criteria reduce to checking that certain divisibility properties do not 
hold; if they hold, we must try certain other divisibility properties, and so 
on. The verification for each p requires extensive computer computations. 
As of 1955, this was successfully done for all p < 4002 (J. L. Selfridge, C. 
A. Nicol, H. S. Vandiver, Proc. Nat. Acad. Sci. USA, 41, 970-973 (1955)). 

Let v(x) denote the ratio of the number of irregular primes < x to the
number of regular primes < x. Kummer conjectured that v(x) → 1/2 as
x → ∞. Siegel (Nachrichten Ak. Wiss. Göttingen, Math. Phys. Klasse, 1964,
No. 6, 51-57) suggests that √e − 1 is a more likely value for the limit,
supports this opinion with probabilistic arguments, compares with the data 
of Selfridge-Nicol-Vandiver, and concludes this discussion with the 
following unexpected sentence: “In addition, it must be taken into account 
that the above numerical values for v(x) were obtained using computers, 
and therefore, strictly speaking, cannot be considered proved”! 
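
For small primes the definition of regularity can be applied directly with exact rational arithmetic. The Python sketch below is only an illustration of the definition and has nothing to do with the method of the 1955 computation; it uses the standard recurrence for the Bernoulli numbers, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def bernoulli(n_max: int) -> list:
    """B_0, ..., B_{n_max} from the recurrence sum_{k<=m} C(m+1, k) B_k = 0 for m >= 1."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B

def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))

def irregular_primes(x: int) -> list:
    """Odd primes p <= x dividing the numerator of some B_2, B_4, ..., B_{p-3}."""
    B = bernoulli(x)
    return [p for p in range(3, x + 1) if is_prime(p)
            and any(B[k].numerator % p == 0 for k in range(2, p - 2, 2))]

irregular = irregular_primes(100)
odd_primes = [p for p in range(3, 101) if is_prime(p)]
print(irregular)
print(len(irregular) / (len(odd_primes) - len(irregular)))   # ratio for odd primes up to 100
```

With so few primes the ratio says nothing about the limit; the sketch only shows that each individual value of v(x) is the result of a finite but rapidly growing exact computation.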


4. Siegel’s point of view can be explained as a natural reaction to 
information received secondhand. But the excerpts below are from an 
article by a professional mathematician and experienced computer pro- 
grammer (Acta Arithmetica, XVIII, 1971, 371-385). The article is devoted 
to the following problem: 


“Let L₁, L₂, L₃ be three homogeneous linear forms in u, v, w with real
coefficients and determinant Δ; and suppose that the lower bound of
|L₁L₂L₃| for integer values of u, v, w not all zero is 1.” What can be said
about the possible value for Δ?

“The corresponding problem for the product of two linear forms is
much easier, and was essentially completely solved by Markov. There are
countably many possible values of Δ less than 3, each of which has the
form

Δ = (9 − 4n⁻²)^{1/2}

for some integer n; the first few values of n are 1, 2, 5, 13, 29, and there is
an algorithm for constructing all the permissible values of n.”
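
The quotation does not describe the algorithm. The Python sketch below assumes that the permissible n are the Markov numbers, i.e., the entries of the positive integer solutions of a² + b² + c² = 3abc, and generates them by the standard exchange c → 3ab − c; this identification and the function name are assumptions made for the illustration.

```python
from math import sqrt

def markov_numbers(bound: int) -> list:
    """Markov numbers up to bound: entries of positive integer solutions of
    a^2 + b^2 + c^2 = 3abc, found by the exchange c -> 3ab - c."""
    seen, stack, numbers = set(), [(1, 1, 1)], set()
    while stack:
        triple = tuple(sorted(stack.pop()))
        if triple in seen or triple[2] > bound:
            continue
        seen.add(triple)
        numbers.update(triple)
        a, b, c = triple
        stack += [(a, b, 3 * a * b - c), (a, c, 3 * a * c - b), (b, c, 3 * b * c - a)]
    return sorted(numbers)

for n in markov_numbers(30):
    print(n, sqrt(9 - 4 / n**2))   # the corresponding value of Δ
```

Under this assumption the numbers found below 30 agree with the values 1, 2, 5, 13, 29 listed in the quotation, and the corresponding values of Δ increase toward 3.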


For three forms Davenport (1943) proved that Δ = 7 or Δ = 9 or Δ > 9.1.
In Swinnerton-Dyer’s paper, all values of Δ < 17 are computed under the
assumption that there are only finitely many such values, and he gives a list
of them: the third value is √148, and the last (the eighteenth) is √(2597/9).
Discussing this result, he makes a very interesting comment: 


“When a theorem has been proved with the help of a computer, it is 


impossible to give an exposition of the proof which meets the traditional 
test—that a sufficiently patient reader should be able to work through the 
proof and verify that it is correct. Even if one were to print all the
programs and all the sets of data used (which in this case would occupy 
some forty very dull pages) there can be no assurance that a data tape has 
not been mispunched or misread. Moreover, every modern computer has 
obscure faults in its software and hardware—which so seldom cause 
errors that they go undetected for years—and every computer is liable to 
transient faults. Such errors are rare, but a few of them have probably 
occurred in the course of the calculations reported here.” 


The arguments on the positive side are also very curious: 


“However, the calculation consists in effect of looking for a rather 
small number of needles in a six-dimensional haystack; almost all the 
calculation is concerned with parts of the haystack which in fact contain 
no needles, and an error in those parts of the calculation will have no 
effect on the final results. Despite the possibilities of error, I therefore 
think it almost certain that the list of permissible Δ < 17 is complete; and
it is inconceivable that an infinity of permissible Δ < 17 have been
overlooked.” 


His conclusion: 


“Nevertheless, the only way to verify these results (if this were thought 
worth while) is for the problem to be attacked quite independently, by a 
different machine. This corresponds exactly to the situation in most 
experimental sciences.” 


We note that it is becoming more and more apparent that the process- 
ing, and also the storage, of large quantities of information outside the 
human brain lead to social problems which go far beyond questions of the 
reliability of mathematical deductions. 


5. In conclusion, we quote an impression concerning mechanical proofs, 
even ones done by hand, which is experienced by many. 

After stating a proposition to the effect that “the function Ty, no is 
correctly defined,” a gifted and active young mathematician writes
(Inventiones Math., vol. 3, f.3 (1967), 230):


“The proof of this Proposition is a ghastly but wholly straightforward 
set of computations. It took me several hours to do every bit and as I was 
no wiser at the end—except that I knew the definition was correct—I 
shall omit details here.” 


The moral: a good proof is one which makes us wiser. 


5 Tautologies and Boolean algebras 


5.1 Proposition. A finite list, or “basis,” of tautologies (logical polynomials
in three variables P, Q, R) can be given with the following property.

Let L be any language in ℒ₁, and let 𝒮 be the set of all formulas in L
which can be obtained from the basis tautologies by substituting all
possible formulas in place of P, Q, R. Then any tautology in L is deducible
from 𝒮 using only the rule of deduction MP.


The choice of the basis tautologies is by no means unique. Our list will
consist of the tautologies A0, A1, A2, A3, B1, B2 in Subsection 3.4 and the
following tautologies:


C1 ¬(P ⇒ ¬Q) ⇒ (P ∧ Q), (P ∧ Q) ⇒ ¬(P ⇒ ¬Q).
C2 (¬P ⇒ Q) ⇒ (P ∨ Q), (P ∨ Q) ⇒ (¬P ⇒ Q).
C3 P ⇒ (¬Q ⇒ ¬(P ⇒ Q)).
C4 (P ⇒ Q) ⇒ ((¬P ⇒ Q) ⇒ Q).
C5 (P ⇒ Q) ⇒ (¬Q ⇒ ¬P).
C6 (P ⇒ Q) ⇒ ((Q ⇒ P) ⇒ (P ⇔ Q)).
C7 (P ⇔ Q) ⇒ (P ⇒ Q), (P ⇔ Q) ⇒ (Q ⇒ P).


We are not trying to economize on the size of the basis, but rather on the
length of the proof of Proposition 5.1; hence, A0-C7 is not the shortest
possible list. This does not make any difference for studying the logic of
ℒ₁; but the study of modified logical systems, for example those of the
intuitionist type, requires more careful analysis of this list.
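
That each formula in the list C1-C7 is a tautology can be confirmed by a truth table. The Python sketch below does this under the reading C1: ¬(P ⇒ ¬Q) ⇒ (P ∧ Q) and its converse, C2: (¬P ⇒ Q) ⇒ (P ∨ Q) and its converse, C3: P ⇒ (¬Q ⇒ ¬(P ⇒ Q)), C4: (P ⇒ Q) ⇒ ((¬P ⇒ Q) ⇒ Q), C5: (P ⇒ Q) ⇒ (¬Q ⇒ ¬P), C6: (P ⇒ Q) ⇒ ((Q ⇒ P) ⇒ (P ⇔ Q)), C7: (P ⇔ Q) ⇒ (P ⇒ Q) and (P ⇔ Q) ⇒ (Q ⇒ P). The labels C1a, C1b, etc. are introduced only for this illustration.

```python
from itertools import product

def imp(a: bool, b: bool) -> bool:
    return (not a) or b

def iff(a: bool, b: bool) -> bool:
    return a == b

basis = {
    "C1a": lambda P, Q: imp(not imp(P, not Q), P and Q),
    "C1b": lambda P, Q: imp(P and Q, not imp(P, not Q)),
    "C2a": lambda P, Q: imp(imp(not P, Q), P or Q),
    "C2b": lambda P, Q: imp(P or Q, imp(not P, Q)),
    "C3":  lambda P, Q: imp(P, imp(not Q, not imp(P, Q))),
    "C4":  lambda P, Q: imp(imp(P, Q), imp(imp(not P, Q), Q)),
    "C5":  lambda P, Q: imp(imp(P, Q), imp(not Q, not P)),
    "C6":  lambda P, Q: imp(imp(P, Q), imp(imp(Q, P), iff(P, Q))),
    "C7a": lambda P, Q: imp(iff(P, Q), imp(P, Q)),
    "C7b": lambda P, Q: imp(iff(P, Q), imp(Q, P)),
}

failures = [name for name, f in basis.items()
            if not all(f(P, Q) for P, Q in product([False, True], repeat=2))]
print(failures)   # an empty list means every formula is a tautology
```

Such a check shows only that the basis consists of tautologies; that every tautology is deducible from the basis is the content of Proposition 5.1.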


PROOF OF PROPOSITION 5.1. Let ℰ be a finite set of formulas in L, and let
P be a logical polynomial (with a fixed representation) over ℰ. For any
map v : ℰ → {0, 1}, we extend v to P using the same rules that defined the
truth function | | in Subsection 2.5. We set


P^v = P, if v(P) = 1;
P^v = ¬P, if v(P) = 0.


5.2. Fundamental Lemma. Let ℰ^v = {Q^v | Q ∈ ℰ}. Then for any v we have:
𝒮 ∪ ℰ^v ⊢ P^v (using MP).


This lemma expresses the following idea. It is natural to prove Proposi- 
tion 5.1 by induction on the length of the tautology. However, the 
component parts of a tautology themselves might not be tautologies. The 
operation of taking P to P^v forces any formula to be “v-true” and makes it
possible for us to use induction. 


5.3. PROOF OF 5.1 ASSUMING THE FUNDAMENTAL LEMMA. Let P be a
tautology, so that P^v = P for all v. Set ℰ = {P_1, ..., P_r}. By the
fundamental lemma, 𝒯 ∪ {P_1^v, ..., P_r^v} ⊢ P using MP for any v. We show
that then 𝒯 ∪ {P_1^v, ..., P_{r-1}^v} ⊢ P using MP. Descending induction on
r then gives the required assertion (the assumption that P is a logical
polynomial in P_1, ..., P_r is not used in the induction step).

The Deduction Lemma 4.5 shows that
𝒯 ∪ {P_1^v, ..., P_{r-1}^v} ⊢ (P_r^v ⇒ P)
using MP; to see this we need only examine the proof and notice that the
deduction only used MP and the tautologies in 𝒯, since the rule of
deduction Gen was not needed.

Since for any v there exists a v′ which coincides with v on P_1, ..., P_{r-1}
but takes a different value on P_r, it follows that P_r ⇒ P and ¬P_r ⇒ P are
deducible from 𝒯 ∪ {P_1^v, ..., P_{r-1}^v} using MP. On the other hand, the
tautology C4: (P_r ⇒ P) ⇒ ((¬P_r ⇒ P) ⇒ P) lies in 𝒯. Applying MP twice,
we deduce P. □


5.4. PROOF OF THE FUNDAMENTAL LEMMA. We use induction on the number
of connectives in the representation of P as a logical polynomial over ℰ. If
there are no connectives, that is P ∈ ℰ, then the assertion is obvious.
Otherwise, P has the form ¬Q or Q_1 * Q_2, where * is one of the binary
connectives.

(a) The case P = ¬Q. If v(Q) = 0, then Q^v = ¬Q = P = P^v. That
Q^v = P^v is deducible from 𝒯 ∪ ℰ^v is precisely the induction assumption.

On the other hand, if v(Q) = 1, then Q^v = Q, P^v = ¬¬Q. Here Q is
deducible from 𝒯 ∪ ℰ^v by the induction assumption, and then the
tautology Q ⇒ ¬¬Q in 𝒯 along with MP gives a deduction of P^v.

(b) The case P = Q_1 * Q_2. For the different connectives and possible
values of v(Q_1) and v(Q_2) we first tabulate the formulas for which
deductions exist by the induction assumption and the formulas for which
we must find deductions. In the columns under ∧ and ∨ we give formulas
from which (Q_1 ∧ Q_2)^v and (Q_1 ∨ Q_2)^v, respectively, are deducible using
MP and the tautologies in 𝒯 (tautologies C1, C2, and C5). Hence it
suffices to find deductions of each of formulas 1-16 from 𝒯 and the pair
of formulas on the appropriate row in the second column using MP.

Deduction of formulas 1-16.


                  Given:              Must find: deduction of (Q_1 * Q_2)^v
                  deductions of
v(Q_1)  v(Q_2)    Q_1^v and Q_2^v     ⇒                    ∧

  0       0       ¬Q_1, ¬Q_2          1. Q_1 ⇒ Q_2         5. ¬¬(Q_1 ⇒ ¬Q_2)
  0       1       ¬Q_1, Q_2           2. Q_1 ⇒ Q_2         6. ¬¬(Q_1 ⇒ ¬Q_2)
  1       0       Q_1, ¬Q_2           3. ¬(Q_1 ⇒ Q_2)      7. ¬¬(Q_1 ⇒ ¬Q_2)
  1       1       Q_1, Q_2            4. Q_1 ⇒ Q_2         8. ¬(Q_1 ⇒ ¬Q_2)

v(Q_1)  v(Q_2)    Q_1^v and Q_2^v     ∨                    ⇔

  0       0       ¬Q_1, ¬Q_2          9. ¬(¬Q_1 ⇒ Q_2)     13. Q_1 ⇔ Q_2
  0       1       ¬Q_1, Q_2           10. ¬Q_1 ⇒ Q_2       14. ¬(Q_1 ⇔ Q_2)
  1       0       Q_1, ¬Q_2           11. ¬Q_1 ⇒ Q_2       15. ¬(Q_1 ⇔ Q_2)
  1       1       Q_1, Q_2            12. ¬Q_1 ⇒ Q_2       16. Q_1 ⇔ Q_2


Note that if P is deducible then for any Q the formula Q ⇒ P is also
deducible (tautology A1 and MP) and if ¬P is deducible then for any Q
the formula P ⇒ Q is deducible (tautology B2 and MP). This immediately
yields deductions of 1, 2, 4, 10, and 12. If we remove the double negations
in the ∧ column using tautology B1 and MP, we obtain deductions of 5, 6,
and 7. And 11 is deducible since by B1 the second column yields a
deduction of ¬¬Q_1. In the first and last rows the deductions of 1 and 4
yield deductions of Q_2 ⇒ Q_1 by symmetry; tautology C6 and MP twice
give a deduction of 13 and 16 from Q_1 ⇒ Q_2 and Q_2 ⇒ Q_1.


3 is deduced from C3: Q_1 ⇒ (¬Q_2 ⇒ ¬(Q_1 ⇒ Q_2)) and the second
column using MP twice.

8 is deduced from C3: Q_1 ⇒ (¬¬Q_2 ⇒ ¬(Q_1 ⇒ ¬Q_2)) and the second
column using MP, applying B1 to Q_2, and again using MP.

9 is deduced from C3: ¬Q_1 ⇒ (¬Q_2 ⇒ ¬(¬Q_1 ⇒ Q_2)) using MP twice.

15 is deduced from 3 by C7 and C5 and MP twice.

Finally, the deduction of 3 from Q_1 and ¬Q_2 yields by symmetry a
deduction of ¬(Q_2 ⇒ Q_1) from ¬Q_1 and Q_2. Hence on the second row
the deduction of 14 is analogous to that of 15.
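As a sanity check on the table in 5.4, one can verify semantically that every listed formula is true under the valuation of its row, and that (Q_1 * Q_2)^v is always v-true. The sketch below does only this; it checks truth values, not the existence of deductions, and the tuple encoding of formulas and the helper names ev, hat, Not, Imp are assumptions made for the illustration.

```python
def ev(f, v):
    """Evaluate a formula (nested tuples over atoms 'Q1', 'Q2') under v."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return 1 - ev(f[1], v)
    a, b = ev(f[1], v), ev(f[2], v)
    return {"imp": max(1 - a, b), "and": min(a, b),
            "or": max(a, b), "iff": int(a == b)}[f[0]]

def hat(f, v):
    """Return f^v: f itself if v(f) = 1, otherwise its negation."""
    return f if ev(f, v) == 1 else ("not", f)

def Not(f): return ("not", f)
def Imp(a, b): return ("imp", a, b)

Q1, Q2 = "Q1", "Q2"
EQ = ("iff", Q1, Q2)

# Rows keyed by (v(Q1), v(Q2)); entries from the columns =>, and, or, <=>.
TABLE = {
    (0, 0): [Imp(Q1, Q2), Not(Not(Imp(Q1, Not(Q2)))), Not(Imp(Not(Q1), Q2)), EQ],
    (0, 1): [Imp(Q1, Q2), Not(Not(Imp(Q1, Not(Q2)))), Imp(Not(Q1), Q2), Not(EQ)],
    (1, 0): [Not(Imp(Q1, Q2)), Not(Not(Imp(Q1, Not(Q2)))), Imp(Not(Q1), Q2), Not(EQ)],
    (1, 1): [Imp(Q1, Q2), Not(Imp(Q1, Not(Q2))), Imp(Not(Q1), Q2), EQ],
}

checked = 0
for (a, b), row in TABLE.items():
    v = {Q1: a, Q2: b}
    for f in row:
        assert ev(f, v) == 1          # each table entry is v-true
        checked += 1
    for op in ("imp", "and", "or", "iff"):
        assert ev(hat((op, Q1, Q2), v), v) == 1   # (Q1 * Q2)^v is v-true
```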


Proposition 5.1 is proved. oO 


5.5. Tautologies and probability. Tautologies are statements which are true 
independently of the truth or falsity of their “component parts.” This 
assertion still holds even if the components of a tautology are assigned 
probabilistic truth values ||P|| in the algebra of measurable sets in some 
probability space. 

An example: the tautology R ∨ S ∨ ¬R ∨ ¬S (“either it will rain, or
it will snow, or it won’t rain, or it won’t snow”¹) is a reliable weather
forecast despite the great complexity of the meteorological probability
space.

For a precise result, it is convenient to use the terminology of Boolean 
algebras. 


5.6. Boolean algebras. A Boolean algebra B is a set with an operation ′ of
rank one, with two operations ∨ and ∧ of rank two, and with two
distinguished elements 0 and 1, such that the following axioms hold:


(a) (a′)′ = a for all a ∈ B;

(b) ∧ and ∨ are each associative and commutative;

(c) ∧ and ∨ are distributive with respect to one another;

(d) (a ∨ b)′ = a′ ∧ b′; (a ∧ b)′ = a′ ∨ b′;

(e) a ∨ a = a ∧ a = a;

(f) 1 ∧ a = a; 0 ∨ a = a.


EXAMPLES. 


(a) B is the set of all subsets of a set M, ′ is complement, ∧ is intersection,
∨ is union, 0 is the empty subset, and 1 is all of M.

(b) B is the set of open-and-closed subsets of a topological space M with
the same operations.

(c) B is the algebra of measurable subsets (modulo measure zero subsets)
of a probability space M with the same operations.


¹ A Russian proverb (translator’s note).


In all of these cases B can be identified with the space of characteristic 
functions of the corresponding subsets of M (taking the value 1 on the 
subset and 0 on the complement). 
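Example (a) can be checked mechanically for a small set. The sketch below verifies axioms (a) through (f) of 5.6 for the algebra of all subsets of a three-element set; the choice M = {0, 1, 2} and the names comp, join, meet are assumptions made for the illustration.

```python
from itertools import combinations

M = frozenset({0, 1, 2})
# All subsets of M: the elements of the Boolean algebra B.
B = [frozenset(c) for r in range(len(M) + 1)
     for c in combinations(sorted(M), r)]
ZERO, ONE = frozenset(), M

def comp(a): return M - a        # a'
def join(a, b): return a | b     # a v b
def meet(a, b): return a & b     # a ^ b

def axioms_hold():
    """Check axioms (a)-(f) of 5.6 by exhausting all elements of B."""
    for a in B:
        if comp(comp(a)) != a:                               # (a)
            return False
        if join(a, a) != a or meet(a, a) != a:               # (e)
            return False
        if meet(ONE, a) != a or join(ZERO, a) != a:          # (f)
            return False
        for b in B:
            if join(a, b) != join(b, a) or meet(a, b) != meet(b, a):
                return False                                 # (b) commutative
            if comp(join(a, b)) != meet(comp(a), comp(b)):   # (d)
                return False
            if comp(meet(a, b)) != join(comp(a), comp(b)):   # (d)
                return False
            for c in B:
                if join(a, join(b, c)) != join(join(a, b), c):
                    return False                             # (b) associative
                if meet(a, meet(b, c)) != meet(meet(a, b), c):
                    return False
                if meet(a, join(b, c)) != join(meet(a, b), meet(a, c)):
                    return False                             # (c)
                if join(a, meet(b, c)) != meet(join(a, b), join(a, c)):
                    return False
    return True
```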


5.7. Boolean truth functions. Let B be a Boolean algebra, and let ℰ be a set
of formulas in a language L. Let || || : ℰ → B be any map. We extend this
map to the logical polynomials over ℰ (more precisely, to their
representations) by means of the recursive formulas:


    ||P ⇔ Q|| = (||P||′ ∧ ||Q||′) ∨ (||P|| ∧ ||Q||),
    ||P ⇒ Q|| = ||P||′ ∨ ||Q||,
    ||P ∨ Q|| = ||P|| ∨ ||Q||,
    ||P ∧ Q|| = ||P|| ∧ ||Q||,
    ||¬P|| = ||P||′.


In the case B = {0, 1}, these formulas coincide with the definitions in
2.5. We note that ∨ and ∧ have different meanings in the left- and
right-hand sides.


5.8. Proposition. Let the logical polynomial P be a tautology over ℰ. Then
for any map || || : ℰ → B to any Boolean algebra B we have ||P|| = 1.


PROOF. An example of a natural map || || can be obtained as follows: if we
are given an interpretation of L in a set M, then the truth functions |P|(ξ)
can be considered as the characteristic functions of the definable subsets of
the interpretation class M̄ (compare §2). Hence, our usual truth functions
are essentially Boolean-valued. They are imbedded in the Boolean algebra 
of all subsets of M, which decomposes as a direct product of two-point 
Boolean algebras {0,1}. Hence the proposition follows trivially in this 
case. 

In the general case one could use Stone’s structure-theorem for Boolean 
algebras. However, instead of this we shall indicate how to reduce the 
problem to some simple computations using Proposition 5.1. Because of 
Proposition 5.1, it suffices to verify that the basis tautologies are || ||-true 
and that || ||-truth is preserved when we use MP. For example, if ||P|| = 1
and ||P ⇒ Q|| = 1, then ||P||′ = 0 while ||P||′ ∨ ||Q|| = 1, so that ||Q|| = 1
by 5.6(f); this answers the question about MP. The truth values of the basis
tautologies are computed in a similar manner using the axioms in 5.6. □
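The recursive formulas of 5.7 are easy to experiment with when B is the algebra of subsets of a small finite set. The following sketch evaluates the weather forecast of 5.5 for every possible choice of the events R and S in a hypothetical three-point space and confirms that its value is always 1 (the whole space). The tuple encoding of formulas and the name norm are assumptions of the sketch; it illustrates Proposition 5.8 in one example and proves nothing in general.

```python
from itertools import combinations, product

M = frozenset(range(3))   # a hypothetical three-point "probability space"
SUBSETS = [frozenset(c) for r in range(len(M) + 1)
           for c in combinations(sorted(M), r)]

def norm(f, val):
    """Boolean truth value ||f|| in the algebra of subsets of M (rules of 5.7)."""
    if isinstance(f, str):
        return val[f]
    if f[0] == "not":
        return M - norm(f[1], val)
    a, b = norm(f[1], val), norm(f[2], val)
    if f[0] == "and":
        return a & b
    if f[0] == "or":
        return a | b
    if f[0] == "imp":
        return (M - a) | b
    if f[0] == "iff":
        return ((M - a) & (M - b)) | (a & b)
    raise ValueError(f"unknown connective: {f[0]}")

# R v S v (not R) v (not S): the forecast of 5.5.
forecast = ("or", ("or", "R", "S"), ("or", ("not", "R"), ("not", "S")))

always_one = all(norm(forecast, {"R": r, "S": s}) == M
                 for r, s in product(SUBSETS, repeat=2))
```

By contrast, a non-tautology such as R ∨ S takes the value 1 only for some choices of the events R and S.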


Boolean truth functions will be the basic tool in the presentation of 
Cohen forcing in Chapter III.


Digression: kennings


1. The process in §5 generates all possible tautologies starting with a finite 
number of tautologies and using a finite number of rules. It has become 
very popular in modern linguistics to attempt to find a suitable description 
of natural languages by means of such generating rules (N. Chomsky and 
others; see, for example, the book Eléments de linguistique mathématique by 
A. V. Gladkii and I. A. Mel’čuk, Paris, Dunod, 1972).

However, many psychologists consider that this conception has little to 
do with the actual process of speech. According to one such opinion, real 
speech has more in common with a game of chance, chasing a fugitive, or 
a river current near a jagged shoreline. The choice of the next word in a 
sentence is determined statistically both by a formulating principle (an 
idea, situation, or psychological state) and by the peculiarities of seman- 
tics, grammar, phonetics, and the associative cloud formed by the earlier 
words. 

There is reason to hope that formal grammars are more closely suited to 
describing special fragments of natural languages which are in some sense 
more rigidly defined, such as certain language fragments in poetry or law. 
In these fragments an essential role is played by “prohibitions,” which 
weed out, say, all texts not having a certain rhythmic pattern. Even the 
most casual attempt at writing poetry reveals the psychological reality of 
prohibitions in versification. But it is much less obvious that there is a set 
of generating rules which also has a psychological reality. 


2. Yet there has been at least one poetic system in which generating rules 
occupied an important place. One of the basic elements of skaldic (ancient 
Icelandic) poetry consisted of special formulas called kennings. A kenning 
is an expression which can replace a single word. For example, 


“storm of spears” is a kenning for “battle”

“tree of battle,” “bush of the helmet,” “thrower of swords,” and
“giver of gold” are kennings for “warrior” or “man”

“sea of the wagon” is a kenning for “earth”

“fire of war” is a kenning for “gold”

“sky of sand” and “field of seals” are kennings for “sea,” and so on.

A simple kenning is a kenning no part of which is a kenning. The
examples above are all simple kennings. They play the role of axioms;
obviously, only very great poets have the right to create new simple
kennings. It falls to the lot of the lesser poets to create new kennings using
the rules of deduction. The rule of deduction of a new kenning from earlier
kennings is as follows: any word in a kenning may be replaced by a (not
necessarily simple) kenning for that word. Here is a complicated example
of a kenning together with its decomposition into simple kennings (an
actual example):

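“thrower of the fire of the storm of the witch of the moon of the steed of the ship stables”

Decomposition (innermost kenning first): “steed of the ship stables” = ship;
“moon of the [ship]” = shield; ...; the whole expression = warrior or man.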
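The rule of deduction for kennings is a plain substitution rule, so it can be mimicked in a few lines. The sketch below uses only the simple kennings quoted above, treats an expression as a list of words, and ignores the articles that a real skald (or translator) would insert; the dictionary SIMPLE and the function names are assumptions of the illustration.

```python
# Simple kennings quoted in the text: word -> expressions that may replace it.
SIMPLE = {
    "battle": ["storm of spears"],
    "warrior": ["tree of battle", "bush of the helmet",
                "thrower of swords", "giver of gold"],
    "earth": ["sea of the wagon"],
    "gold": ["fire of war"],
    "sea": ["sky of sand", "field of seals"],
}

def one_step(expr):
    """All expressions obtained by replacing one word of expr by a simple kenning."""
    words = expr.split()
    out = set()
    for i, w in enumerate(words):
        for k in SIMPLE.get(w, []):
            out.add(" ".join(words[:i] + k.split() + words[i + 1:]))
    return out

def derivable(word):
    """All kennings for `word` deducible from SIMPLE.

    The loop terminates here because no word in SIMPLE leads back to itself.
    """
    seen, frontier = set(), set(SIMPLE.get(word, []))
    while frontier:
        seen |= frontier
        frontier = {e for f in frontier for e in one_step(f)} - seen
    return seen
```

For instance, derivable("warrior") contains the two-step kenning "giver of fire of war", obtained from "giver of gold" by replacing "gold".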


The Soviet poet Leonid Martynov thought of kennings as metaphors (a 
fundamental error, although an understandable one—kennings and meta- 
phors play completely different structural roles in different poetic systems), 
and he wrote a poem “Songs of the Skalds” which ends as follows: 


...But perhaps the translators have gotten a bit carried away? 


No! 
In our times, too, 
might there not live 

some throwers 

of the fire 

of the storm 
of the witch 
of the moon 
of the steed 
of the ship stables, 
or 
Squanderers 
of the amber 
of the cold earth 
of the great boar? 
Anything is possible!! 
And who can be so very sure 
That there are no longer songs
which could be called 
Surf 
of yeast 
of the people 
of the bones 
of the fjord? 

Perhaps there really are such songs now, 


Who can tell??


After all this, the professional opinion of M. I. Steblin-Kamenskii,
whose book Icelandic Culture (Leningrad, Nauka, 1967) provided us with
the above examples, sounds a little anticlimactic: “As a rule, any kenning
for a man or warrior was no richer in content than the pronoun ‘he.’ ”


EXERCISES: 


(a) Find the simple kennings from which the last two kennings in Martynov’s 
poem are deduced. 

(b) Construct the kennings of maximum length which are deducible from all the 
simple kennings in the above text. Prove that it is impossible to deduce longer 
kennings. 


6 Gödel’s completeness theorem


6.1. Let L be a language in ℒ₁, let φ be an interpretation of L, and let T_φL
be the set of φ-true formulas. In §3 it was shown that the set T_φL is
Gödelian: it is complete, does not contain a contradiction, is closed with
respect to deduction, and contains all the logical axioms Ax L. We say that
a set of formulas ℰ in L is consistent if the set of formulas deducible from
ℰ does not contain a contradiction, i.e., if there is no P such that ℰ ⊢ P
and ℰ ⊢ ¬P; otherwise, we say that ℰ is inconsistent. The basic purpose of
this section is to prove the following converse of the result in §3:


6.2. Theorem (Gödel)

(a) Any Gödelian set T is the set of φ-true formulas T_φL for a suitable
interpretation φ of L in some set M having cardinality ≤ card (alphabet of
L) + ℵ₀. (Here and below we always mean the cardinality of the alphabet
without the variables.)

(b) Any set of formulas ℰ which contains Ax L and is consistent can be
imbedded in a Gödelian set.


The model M which is constructed in the proof consists of expressions
in some extension of the alphabet of L, and thus has a somewhat artificial
character. In the next section we show that, if we are given some natural
interpretation (M, φ) of L, then we can find a submodel having cardinality
≤ card (alphabet of L) + ℵ₀.


6.3. Corollary (Deducibility criterion). Let ℰ ⊃ Ax L.

(a) A formula P is deducible from ℰ if and only if either ℰ is
inconsistent, or P is φ-true for all models of the set ℰ having cardinality
≤ card (alphabet of L) + ℵ₀.

(b) A formula P is independent of ℰ if and only if both ℰ ∪ {P} and
ℰ ∪ {¬P} are consistent; by Theorem 6.2, this is true if and only if
ℰ ∪ {P} and ℰ ∪ {¬P} have models.


In what follows we shall often omit the verification that various formal
deductions exist. If the reader wants to fill in such a verification, this can
almost always be done more easily using deducibility criterion 6.3 than
directly.


PROOF OF THE COROLLARY

(a) If ℰ is inconsistent, then any formula can be deduced from ℰ
(Proposition 4.2). Suppose ℰ is consistent and P is φ-true for all models of
ℰ. Let P̄ = ∀x_1 ⋯ ∀x_n P be the “closure” of P. To prove that ℰ ⊢ P we
consider two cases.

(a_1) ℰ ∪ {¬P̄} is inconsistent. Then ℰ ∪ {¬P̄} ⊢ P̄, so that, by the
Deduction Lemma, ℰ ⊢ ¬P̄ ⇒ P̄. The tautology (¬P̄ ⇒ P̄) ⇒ P̄ and MP
give ℰ ⊢ P̄, and then the axiom of specialization and MP give ℰ ⊢ P.

(a_2) ℰ ∪ {¬P̄} is consistent. Then, by Theorem 6.2, the set ℰ ∪ {¬P̄}
has a model. In this model ℰ is true and P is false, so that this case is
impossible.

(b) Suppose that P is independent of ℰ, i.e., neither P nor ¬P is
deducible. Then, by part (a), there exists a model of ℰ in which P is true
and a model of ℰ in which P is false. The converse is obvious. □


We now proceed to the proof of Gödel’s completeness theorem.


6.4. Definition. Let ℰ be a set of formulas in a language L. The alphabet of
L is said to be sufficient for ℰ if, for each closed formula ¬∀x P(x) in
ℰ, there exists a constant c_P (depending on P) such that the formula

    R_P: ¬∀x P(x) ⇒ ¬P(c_P)

belongs to ℰ.


The intuitive meaning of R_P is: “If not all x have the property P, then
some concrete object c_P can be found which does not have this property.”
We say that the alphabet (rather than ℰ) is “sufficient” or “insufficient”
because, if ℰ does not contain enough formulas of the type R_P, we can
simply add all the R_P to ℰ, while if there are not enough constants c_P, we
then have to add them to the alphabet of the language.

The plan for proving Theorem 6.2 is as follows. We first prove the 
Fundamental Lemma: 


6.5. Fundamental Lemma. If a set of formulas ℰ in a language L is
consistent and complete and contains Ax L, and if the alphabet of L is
sufficient for ℰ, then ℰ has a model with cardinality ≤ card (alphabet of
L) + ℵ₀.


The next two lemmas allow us to imbed any consistent ℰ in a complete
set, or in one for which the alphabet is sufficient.


6.6. Lemma. If ℰ is consistent and contains Ax L, then there exists a
consistent and complete set of formulas ℰ′ ⊃ ℰ.


6.7. Lemma. If ℰ is consistent and contains Ax L, then there exist:

(a) a language L′ whose alphabet is obtained from the alphabet of L by
adding a set of new constants having cardinality ≤ card (alphabet of
L) + ℵ₀;

(b) a set of formulas ℰ′ in L′ which is consistent, contains ℰ and
Ax L′, and has the property that the alphabet of L′ is sufficient for ℰ′.


However, these constructions get in each other’s way. If we complete a
set ℰ for which the alphabet is sufficient, we might obtain a set with an
insufficient alphabet; if we add new constants, we increase the overall
supply of formulas in the language, and thereby lose the completeness of
ℰ. Hence, we have to alternate the constructions in 6.6 and 6.7 a countable
number of times in order to prove our last lemma:


6.8. Lemma. If ℰ ⊃ Ax L is consistent, then there exist:

(a) a language L^(∞) whose alphabet is obtained from the alphabet of L
by adding a set of new constants having cardinality ≤ card (alphabet of
L) + ℵ₀;

(b) a set of formulas ℰ^(∞) in L^(∞) which is complete and consistent,
contains ℰ and Ax L^(∞), and has the property that the alphabet of L^(∞) is
sufficient for ℰ^(∞).


After Lemma 6.8 is proved, Theorem 6.2 is obtained from the Funda-
mental Lemma applied to ℰ^(∞) if we restrict the resulting model to L and
ℰ.

We now prove the lemmas. The Fundamental Lemma is proved in 6.9,
and Lemmas 6.6, 6.7, and 6.8 are proved in Subsections 6.10, 6.11, and
6.12, respectively.


6.9. PROOF OF THE FUNDAMENTAL LEMMA. We begin by explicitly
constructing the interpretation φ of L which will be our model for ℰ.

(a) By a constant term we mean a term in L which does not contain any
symbols for variables. We let M = {t̄ | t is a constant term} be a “second
copy” of the set of constant terms, and we define the primary mappings of
the interpretation φ of L in M as follows:


    φ(c) = c̄  (for any constant c);

    φ(f)(t̄_1, ..., t̄_r) = the copy in M of the term f(t_1, ..., t_r)
        (for each operation symbol f of degree r and all constant terms
        t_1, ..., t_r);

    ⟨t̄_1, ..., t̄_r⟩ ∈ φ(p) if and only if p(t_1, ..., t_r) ∈ ℰ
        (for each relation p of degree r and all constant terms
        t_1, ..., t_r).


We now prove the following


(b) Claim. Let P be a closed formula. Then |P|_φ = 1 if and only if P ∈ ℰ.
(This claim implies that φ is a model for ℰ. In fact, if P ∈ ℰ is not
closed, then its closure ∀x_1 ⋯ ∀x_n P is deducible from ℰ using Gen,
and hence, since ℰ is complete and consistent, ∀x_1 ⋯ ∀x_n P ∈ ℰ. By
the claim, |∀x_1 ⋯ ∀x_n P|_φ = 1, so that |P|_φ = 1.)
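The underlying set M in 6.9(a) is built purely from syntax, and for a small alphabet its elements can be listed explicitly. The following sketch enumerates the constant terms up to a given nesting depth and writes them as strings; the function name constant_terms and the encoding of the alphabet as a set of constants plus a dictionary of operation symbols with their degrees are assumptions made for the illustration.

```python
from itertools import product

def constant_terms(constants, operations, depth):
    """Constant terms of nesting depth <= depth, written as strings.

    constants:  iterable of constant symbols
    operations: dict mapping each operation symbol to its degree
    """
    terms = set(constants)
    for _ in range(depth):
        new = set(terms)
        for f, r in operations.items():
            # apply f to every r-tuple of terms built so far
            for args in product(sorted(terms), repeat=r):
                new.add(f"{f}({','.join(args)})")
        terms = new
    return terms
```

With one constant 0 and one unary operation s, two rounds give the terms 0, s(0), and s(s(0)); the full set M is the union over all depths and is countable whenever the alphabet is.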


PROOF OF THE CLAIM. We use induction on the total number of quantifiers
and connectives in P. We shall write |P| instead of |P|_φ.

(b_1) P is an atomic formula p(t_1, ..., t_r). The claim follows from the
definition of |P| and the list of primary mappings, since the t_i are constant
terms (or else P would not be closed).

(b_2) P = ¬Q. If |P| = 1, then |Q| = 0 and Q ∉ ℰ by the induction
assumption applied to Q; since ℰ is complete, we have ¬Q ∈ ℰ, i.e.,
P ∈ ℰ. On the other hand, if |P| = 0, then |Q| = 1 and Q ∈ ℰ, so that
¬Q ∉ ℰ since ℰ is consistent.

(b_3) P = (Q_1 ⇒ Q_2). We first show that if |P| = 0 then P ∉ ℰ. In fact, in
this case |Q_1| = 1 and |Q_2| = 0; by the induction assumption, Q_1 ∈ ℰ,
Q_2 ∉ ℰ; since ℰ is complete, ¬Q_2 ∈ ℰ; using the tautology
Q_1 ⇒ (¬Q_2 ⇒ ¬(Q_1 ⇒ Q_2)) and using MP twice yields ℰ ⊢ ¬(Q_1 ⇒ Q_2).
Since ℰ is complete and consistent, all closed formulas which are deducible
from ℰ belong to ℰ; hence, ¬(Q_1 ⇒ Q_2) = ¬P ∈ ℰ, so that P ∉ ℰ.

We now show that, if P ∉ ℰ, then |P| = 0. In fact, since ℰ is complete,
we then have ¬P = ¬(Q_1 ⇒ Q_2) ∈ ℰ. The tautologies ¬(Q_1 ⇒ Q_2) ⇒ Q_1
and ¬(Q_1 ⇒ Q_2) ⇒ ¬Q_2 and MP give ℰ ⊢ Q_1 and ℰ ⊢ ¬Q_2, so that, since
ℰ is complete and consistent, Q_1 ∈ ℰ and ¬Q_2 ∈ ℰ. By the induction
assumption, |Q_1| = 1 and |Q_2| = 0, so that |P| = |Q_1 ⇒ Q_2| = 0.

(b_4) P = Q_1 ∨ Q_2 or Q_1 ∧ Q_2. Using the tautologies which express ∧
and ∨ in terms of ⇒ and ¬, we can reduce to the previous cases; we
omit the details.

(b_5) P = ∀x Q. If x does not occur freely in Q, then |P| = 1 is equivalent
to |Q| = 1, i.e., by the induction assumption, to Q ∈ ℰ. But Q ∈ ℰ is
equivalent to ∀x Q ∈ ℰ, in one direction using Gen and in the other
direction using the axiom of specialization with t = x and then MP.

We now assume that x occurs freely in Q. We first suppose that |P| = 1
but P ∉ ℰ, and obtain a contradiction. If P ∉ ℰ, then ¬P ∈ ℰ, i.e.,
¬∀x Q(x) ∈ ℰ. Since the alphabet of L is sufficient for ℰ, it follows that
ℰ contains the formula ¬∀x Q(x) ⇒ ¬Q(c_Q). Applying MP, we obtain
ℰ ⊢ ¬Q(c_Q); since ℰ is consistent, we have Q(c_Q) ∉ ℰ. By the induction
assumption, |Q(c_Q)| = 0 (Q(c_Q) is closed!). This means that |Q(x)|(ξ) = 0
for ξ ∈ M̄ if x^ξ = c̄_Q, contradicting the assumption that |P| = 1.

We now suppose that |P| = 0 but P ∈ ℰ, and obtain a contradiction.
Since |P| = 0, for some ξ ∈ M̄ we have |Q(x)|(ξ) = 0. Let t be the constant
term for which x^ξ = t̄. Clearly t is free for x in Q, so that
0 = |Q(x)|(ξ) = |Q(t)|. Hence Q(t) ∉ ℰ by the induction assumption, and
¬Q(t) ∈ ℰ since ℰ is complete. On the other hand, if P ∈ ℰ, i.e.,
∀x Q(x) ∈ ℰ, then the axiom of specialization ∀x Q(x) ⇒ Q(t) gives us
ℰ ⊢ Q(t). But, since ¬Q(t) ∈ ℰ, this contradicts the consistency of ℰ.

(b_6) P = ∃x Q. This reduces to the previous case using the axiom which
expresses ∃ in terms of ∀ and negation; we omit the details. □


6.10. PROOF OF LEMMA 6.6. In order to imbed ℰ in a complete and
consistent set ℰ′, we shall have to use Zorn’s lemma and the Deduction
Lemma for L (see Subsection 4.5 of Chapter II). Zorn’s lemma will be
applied to the set Cℰ = the set of sets of formulas ℰ′ in L which contain
ℰ and are consistent. The set Cℰ is ordered by inclusion.


VERIFICATION OF THE HYPOTHESIS OF ZORN’S LEMMA. Let {ℰ′_α} be a
linearly ordered subset of Cℰ, i.e., for any α and β we have either
ℰ′_α ⊂ ℰ′_β or ℰ′_β ⊂ ℰ′_α. Then the union ∪ℰ′_α belongs to Cℰ. In fact,
otherwise ∪ℰ′_α would be inconsistent, and there would exist a deduction
of a contradiction from a finite number of formulas. Suppose these
formulas are contained in ℰ′_{α_1}, ..., ℰ′_{α_n}. But one of these sets contains
the remaining n − 1; this set would be inconsistent, contrary to the
definition of Cℰ.


PROOF OF LEMMA 6.6 FROM ZORN’S LEMMA. The set Cℰ has a maximal
element, i.e., a consistent set ℰ′ ⊃ ℰ such that if Q ∉ ℰ′ then ℰ′ ∪ {Q} is
inconsistent. We claim that ℰ′ is complete. In fact, suppose that there were
a closed formula P such that P ∉ ℰ′ and ¬P ∉ ℰ′. Since ℰ′ is maximal,
it follows that ℰ′ ∪ {P} ⊢ R and ℰ′ ∪ {¬P} ⊢ R for any formula R. By
the Deduction Lemma, ℰ′ ⊢ P ⇒ R and ℰ′ ⊢ ¬P ⇒ R. Using the tautology
(P ⇒ R) ⇒ ((¬P ⇒ R) ⇒ R) and MP, we have ℰ′ ⊢ R, contradicting the
consistency of ℰ′. □
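When there are only finitely many formulas to decide, Zorn’s lemma can be replaced by a step-by-step procedure: run through the formulas and add each one, or else its negation, whichever keeps the set consistent. The sketch below does this in a toy propositional setting with two atoms, where “consistent” is replaced by “satisfiable under some valuation.” This substitution, the tuple encoding, and the names ev, consistent, complete are assumptions of the illustration; it is not the construction used in the proof.

```python
from itertools import product

ATOMS = ("p", "q")

def ev(f, v):
    """Evaluate a formula built from atoms with 'not' and 'imp'."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return 1 - ev(f[1], v)
    if f[0] == "imp":
        return max(1 - ev(f[1], v), ev(f[2], v))
    raise ValueError(f"unknown connective: {f[0]}")

def consistent(formulas):
    """Stand-in for consistency: some valuation makes every formula true."""
    return any(all(ev(f, dict(zip(ATOMS, bits))) == 1 for f in formulas)
               for bits in product((0, 1), repeat=len(ATOMS)))

def complete(base, candidates):
    """Extend `base` by each candidate or, if that is inconsistent, its negation.

    If `base` is consistent, every intermediate set stays consistent:
    when adding f fails, every model of the current set makes f false,
    so adding the negation of f is safe.
    """
    ext = list(base)
    for f in candidates:
        ext.append(f if consistent(ext + [f]) else ("not", f))
    return ext

result = complete([("imp", "p", "q"), ("not", "q")], ["p", "q"])
```

Here p cannot be added to {p ⇒ q, ¬q}, so ¬p is added instead, and likewise for q.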


6.11. PROOF OF LEMMA 6.7. In constructing a language with a sufficient
alphabet for a consistent set of formulas ℰ′ which contains ℰ and Ax L′,
we proceed in the most natural way.

(a) We add to the alphabet of L a set of new constants whose
cardinality is that of the alphabet of L + ℵ₀. We obtain a language L′.

(b) We consider the set of formulas ℰ ∪ Ax L′ in the language L′,
where Ax L′ consists of all the logical axioms of L′. We claim that this set
of formulas is consistent. In fact, if there were a deduction of a
contradiction from ℰ ∪ Ax L′ in L′, then the following procedure would
transform it into a deduction of a contradiction from ℰ in L: take the finite
set consisting of all the new constants which occur in the formulas in the
deduction and replace these constants by old variables (in L) which do not
occur in the formulas in the deduction. It is easily verified that the
deduction of a contradiction remains a deduction of a contradiction, and
now lies entirely in L.

(c) We consider the set S of formulas P(x) containing one free variable
x and such that ¬∀x P(x) ∈ ℰ ∪ Ax L′. For each P(x) in S we choose a
new constant c_P subject to the following restriction: each c_P can be
assigned a natural number, its rank, in such a way that if a constant of
rank n occurs in P(x) then c_P has rank > n. This can be done since
card(S) ≤ card(alphabet of L′) = card(alphabet of L) + ℵ₀. For each P(x)
in S define the formula

    R_P: ¬∀x P(x) ⇒ ¬P(c_P)

and finally let

    ℰ′ = ℰ ∪ Ax L′ ∪ {R_P | P(x) ∈ S}.


Call any R_P an R-formula. Note that no R-formula has the form
¬∀x P(x), so that L′ is sufficient for ℰ′. It remains only to verify that ℰ′
is consistent. If a contradiction were deducible from ℰ′ then it would be
deducible using finitely many R-formulas. At least one R_P among these
must be such that c_P does not occur in any of the others: namely, pick c_P
of maximal rank. Hence it suffices to verify that if ℰ ∪ Ax L′ ∪ ℛ is
consistent, where ℛ is a set of formulas not containing c_P, then the
addition of R_P does not lead to a contradiction.

Suppose ℰ ∪ Ax L′ ∪ ℛ ∪ {R_P} were inconsistent. Then, in particular,
we would have a deduction of ¬R_P and, by the Deduction Lemma,
ℰ ∪ Ax L′ ∪ ℛ ⊢ R_P ⇒ ¬R_P. The tautology (R_P ⇒ ¬R_P) ⇒ ¬R_P and MP
would yield a deduction of ¬R_P; that is,

    ℰ ∪ Ax L′ ∪ ℛ ⊢ ¬(¬∀x P(x) ⇒ ¬P(c_P)).

Then the tautology ¬(P ⇒ ¬Q) ⇒ Q and MP would yield a deduction of
P(c_P). Transform this deduction by replacing the constant c_P with a
variable y which does not occur in the formulas in the deduction. Since c_P
does not occur in ℛ it is easily verified that the transformation yields a
deduction of P(y) from ℰ ∪ Ax L′ ∪ ℛ. Using Gen,
ℰ ∪ Ax L′ ∪ ℛ ⊢ ∀y P(y). But since ¬∀x P(x) ∈ ℰ ∪ Ax L′, we have
ℰ ∪ Ax L′ ⊢ ¬∀y P(y). Hence ℰ ∪ Ax L′ ∪ ℛ is inconsistent, contrary to
hypothesis. □


6.12. PROOF OF LEMMA 6.8. Let L be a language in the class ℒ₁, and let ℰ
be a set of formulas in L. We imbed ℰ in a complete and consistent set
ℰ′, and then apply Lemma 6.7 to (L, ℰ′). We let L* and ℰ* denote the
resulting language and set of formulas. We further define inductively

    (L^(0), ℰ^(0)) = (L, ℰ);  (L^(i+1), ℰ^(i+1)) = (L^(i)*, ℰ^(i)*),

and finally

    L^(∞) = ∪_{i=0}^{∞} L^(i),  ℰ^(∞) = ∪_{i=0}^{∞} ℰ^(i).


The set ℰ^(∞) is consistent, since any deduction of a contradiction would
be obtained “at some finite level,” and all the ℰ^(i) are consistent. It is
complete, since every closed formula in L^(∞) is written in the alphabet of
L^(i) for some i, and ℰ^(i+1) contains the completion of ℰ^(i) in L^(i). Finally,
the alphabet of L^(∞) is sufficient for ℰ^(∞) by the same argument.


This completes the proof of the lemmas. □


6.13. DEDUCTION OF THEOREM 6.2 FROM THE LEMMAS. Let T be a Gödelian
set of formulas in L. Applying Lemma 6.8 to T, we imbed (L, T) in
(L^(∞), T^(∞)), where the pair (L^(∞), T^(∞)) satisfies Lemma 6.5. Let φ^(∞) be
an interpretation of L^(∞) such as must exist by Lemma 6.5. The cardinality
of M^(∞) does not exceed card (alphabet of L) + ℵ₀. The restriction φ of
φ^(∞) to L satisfies the condition T ⊂ T_φL. We prove that T = T_φL. In fact,
let P ∈ T_φL. If P is closed, then P ∈ T, since either P or ¬P lies in T by
completeness, and ¬P ∉ T because P is φ-true. If P is not closed, and
x_1, ..., x_n are the variables which occur freely in P, then ∀x_1 ⋯ ∀x_n P is
closed and belongs to T. By the axiom of specialization, P is deducible from
T ∪ {∀x_1 ⋯ ∀x_n P}, so that P ∈ T, since T is closed under deduction.
This proves the first assertion of the theorem.

The second assertion follows from the analogous argument applied to ℰ
instead of T. We find a model φ for ℰ; then ℰ ⊂ T_φL and T_φL is
Gödelian. □


6.14. In conclusion, we note that, if the alphabet of L contains a symbol =
for which the axioms of equality are included in ℰ (or T), then there exists
a normal interpretation which satisfies Theorem 6.2 and takes = into
equality. To prove this, we take the above model M and divide out by the
equivalence relation φ(=), as in Subsection 4.6.


7 Countable models and Skolem’s paradox 


“I know what you’re thinking about,” said
Tweedledum: “but it isn’t so, nohow.”
“Contrariwise,” continued Tweedledee, “if it
was so, it might be; and if it were so, it would
be: but as it isn’t, it ain’t. That’s logic.”

Lewis Carroll, Through the Looking Glass 


7.1. In this section we discuss the technique of “cutting down” models, in
particular, models for L₁Set. Let L be a language in ℒ₁, let M ⊂ N be two
sets (or classes in V), and let φ and ψ be interpretations of L in M and N,
respectively, which are compatible in the obvious sense, so that ψ is an
extension of φ. We have a natural imbedding of interpretation classes
M̄ ⊂ N̄.


7.2. Definition. A formula P in L is called (M, N)-absolute if for all ξ ∈ M̄
we have

    |P|_M(ξ) = |P|_N(ξ).


(We write | |_M instead of | |_φ, and so on.)


The property of being absolute is usually used as follows: if P is 
absolute, and is also N-true, then it is automatically M-true. A formula P 
often fails to be absolute for the following reason: a formula P = ∃x Q(x)
can be N-true, so that N has an object with the property Q, but not 
M-true, because no such object lies in M. The proof of the following 
assertion shows how to handle this situation. 


7.3. Proposition. Let ℰ be a set of formulas in L, let ψ be an interpretation of
L in N, and let M_0 ⊂ N be a subset. Then there exists a set M,
M_0 ⊂ M ⊂ N, having cardinality ≤ card M_0 + card ℰ + ℵ₀, such that all the
formulas in ℰ are (M, N)-absolute.


7.4. Corollary (Löwenheim–Skolem). If the alphabet of L is countable and N
is a model for ℰ, then N has a countable submodel for ℰ.


The corollary follows from Proposition 7.3 if we construct a countable 
submodel with respect to which all the formulas of L are absolute, and, in
particular, in which all formulas which were true before remain true. 


PROOF OF 7.3. Suppose the set M_i ⊂ N, i ≥ 0, has already been defined. Set

    M_{i+1} = M_i ∪ {x^{ξ′} | ξ′ = ξ′(x, P, ξ)},

where x runs through the variables in L, P runs through the subformulas
of the formulas in ℰ, and ξ runs through the points of M̄_i, and where, for
each fixed triple (x, P, ξ), ξ′(x, P, ξ) is any one variation of ξ along x for
which |P|_N(ξ′) = 1 if such a variation exists; otherwise the triple does not
make any contribution to M_{i+1}.

Further set M = ∪_{i=0}^{∞} M_i. M clearly has the desired cardinality. We
now show that all subformulas of the formulas in ℰ are (M, N)-absolute.
We use induction on the number of quantifiers and connectives in the
formula. The result is obvious for atomic formulas; the inductive step
when a new formula is constructed using a connective is also clear. The
quantifier ∀ reduces to ∃ in the usual way.

Thus, suppose P is absolute. We show that ∃x P is also absolute. It
suffices to consider the case when x occurs freely in P. For ξ ∈ M̄ we
have:

    |∃x P|_N(ξ) = 1, if there exists a variation ξ′ ∈ N̄ of ξ along x
                     with |P|_N(ξ′) = 1,
                = 0, otherwise.

    |∃x P|_M(ξ) = 1, if there exists a variation ξ″ ∈ M̄ of ξ along x
                     with |P|_M(ξ″) = 1,
                = 0, otherwise.


But the conditions on the right are equivalent. In fact, there exists a
variation η of the point ξ along variables which do not occur freely in P,
such that η ∈ M̄_i for some i. Then in the case |∃x P|_N(ξ) = |∃x P|_N(η) = 1
there is a ξ′ ∈ N̄ with |P|_N(ξ′) = 1 ⇒ there is an η′ ∈ M̄_{i+1} with
|P|_N(η′) = 1, where η′ is a variation of η along x, by the construction of
M_{i+1}. This completes the proof. □


7.5. We now apply Corollary 7.4 to the standard interpretation of $L_1$Set in
the von Neumann universe V and the set $\mathcal{E}$ of Zermelo-Fraenkel axioms.
We obtain a countable model N for this axiom system, but this model has
one defect: if $X \in N$, some elements in X might not themselves belong to
N, i.e., N is not necessarily transitive. The following result of Mostowski
shows how to replace N by a transitive countable model.

Let $N \subset V$ be a subclass, and let $\varepsilon \subset N \times N$ be a binary relation. We
shall write $X \varepsilon Y$ instead of $\langle X, Y \rangle \in \varepsilon$. For any $X \in N$ we set

$$[X] = \{Y \mid Y \varepsilon X\}.$$

Suppose that $[X] \in V$ for all $X \in N$, i.e., each [X] is a set rather than a class.
We consider the interpretation $\varphi$ of $L_1$Set in the class N for which $\varphi(\in)$ is $\varepsilon$
and $\varphi(=)$ is equality.


7.6. Proposition (Mostowski). Suppose that the axiom of extensionality and
the axiom of the empty set are $\varphi$-true, and that N does not contain any
infinite chain $\cdots X_n \varepsilon X_{n-1} \varepsilon \cdots \varepsilon X_1 \varepsilon X_0$. Then there exists a unique
transitive class $M \subset V$ and a unique isomorphism $f : (N, \varepsilon) \to (M, \in)$.


If we apply this proposition to the countable model $(N, \in)$ for the
Zermelo-Fraenkel axioms in subsection 7.5, we obtain a transitive
countable model $(M, \in)$, that is, a "small universe." (The condition that all
$\in$-chains are finite even holds in V, as well as in N; [X] is the subset
$X \cap N \subset X$, and hence is an element of V.)


7.7. PROOF OF PROPOSITION 7.6. Using transfinite induction, for every
ordinal $\alpha$ we construct sets $N_\alpha \subset N$, $M_\alpha \subset V$ and compatible isomorphisms
$f_\alpha : (N_\alpha, \varepsilon|_{N_\alpha}) \to (M_\alpha, \in|_{M_\alpha})$, and we show that $\bigcup N_\alpha = N$.

(a) Since the axiom of extensionality is $\varphi$-true and $\varphi(=)$ is equality, we
easily obtain $X_1 = X_2 \Leftrightarrow [X_1] = [X_2]$ for all $X_1, X_2 \in N$. Let $\emptyset_N \in N$ be the
interpretation of the constant $\emptyset$ of the language $L_1$Set. Since the axiom of
the empty set is $\varphi$-true, we may conclude that $\emptyset_N$ is the unique element of
N for which $[\emptyset_N] = \emptyset \in V$. We set

$$N_0 = \{\emptyset_N\}, \quad M_0 = \{\emptyset\}, \quad f_0(\emptyset_N) = \emptyset.$$


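(b) Recursive construction. Let $\alpha$ be an ordinal. Suppose that $N_\alpha$, $M_\alpha$ and
$f_\alpha$ have already been constructed. We set:

$$N_{\alpha+1} = \{X \in N \mid [X] \subset N_\alpha \wedge X \notin N_\alpha\} \cup N_\alpha;$$

$$f_{\alpha+1}(X) = \{f_\alpha(Y) \mid Y \in [X]\} \ \text{for } X \in N_{\alpha+1} \setminus N_\alpha; \qquad f_{\alpha+1}|_{N_\alpha} = f_\alpha;$$

$$M_{\alpha+1} = \text{image of } f_{\alpha+1} = \text{range of } f_{\alpha+1}.$$

If $\beta$ is a limit ordinal, we set $N_\beta = \bigcup_{\alpha<\beta} N_\alpha$, $M_\beta = \bigcup_{\alpha<\beta} M_\alpha$ and
$f_\beta = \bigcup_{\alpha<\beta} f_\alpha$. Finally, we set $M = \bigcup M_\alpha$ and $f = \bigcup f_\alpha$, where the union
is taken over all the ordinals.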

(c) Inductive proof. We verify that for each $\alpha$

($c_1$) $N_\alpha$ is a set, i.e., $N_\alpha \in V$.

($c_2$) $M_\alpha$ is a transitive subset of V.

($c_3$) $f_\alpha$ is an isomorphism of $N_\alpha$ with $M_\alpha$ taking $\varepsilon$ to $\in$.

($c_4$) $N = \bigcup N_\alpha$.

Assertions ($c_1$)-($c_3$) are obvious for $\alpha = 0$. If they hold for all $\alpha < \beta$ and if
$\beta$ is a limit ordinal, then they also hold for $\beta$. It remains to check the
step from $\alpha$ to $\alpha + 1$.

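($c_1$) $[\ ]$ is obviously a function from $N_{\alpha+1} \setminus N_\alpha$ to $\mathcal{P}(N_\alpha)$; since the
axiom of extensionality is true, there exists an inverse function. Its image
$N_{\alpha+1} \setminus N_\alpha$ is a set, since $N_\alpha$, and therefore $\mathcal{P}(N_\alpha)$, are sets by the induction
assumption.

($c_2$) Any element in $M_{\alpha+1} \setminus M_\alpha$ has the form $\{f_\alpha(Y) \mid Y \in [X]\}$, where
$X \in N_{\alpha+1} \setminus N_\alpha$. But then $[X] \subset N_\alpha$. Hence, an element $f_\alpha(Y)$ of this
element of $M_{\alpha+1} \setminus M_\alpha$ belongs to the image of $f_\alpha$, i.e., to the set $M_\alpha \subset M_{\alpha+1}$.
This proves the transitivity of $M_{\alpha+1}$.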

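($c_3$) We first verify that $f_{\alpha+1}$ is a bijection. The surjectivity is obvious;
using the induction assumption, we see that it suffices to verify injectivity
on $N_{\alpha+1} \setminus N_\alpha$. But if $X_1, X_2 \in N_{\alpha+1} \setminus N_\alpha$ and $f_{\alpha+1}(X_1) = f_{\alpha+1}(X_2)$, then

$$\{f_\alpha(Y) \mid Y \in [X_1]\} = \{f_\alpha(Y) \mid Y \in [X_2]\}.$$

Since $f_\alpha$ is injective, we obtain $[X_1] = [X_2]$, so that $X_1 = X_2$.
We then find:

$$Y \varepsilon X \Leftrightarrow Y \in [X] \Leftrightarrow f_\alpha(Y) \in f_{\alpha+1}(X),$$

so that for $X \in N_{\alpha+1} \setminus N_\alpha$ the relation $Y \varepsilon X$ goes to $f_{\alpha+1}(Y) \in f_{\alpha+1}(X)$. This
is clearly sufficient to complete the induction.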

($c_4$) Finally, we verify that $N = \bigcup N_\alpha$. Let $N' = N \setminus \bigcup N_\alpha$; we suppose
that N' is nonempty and show that this leads to a contradiction. If there
existed an $X \in N'$ such that $[X] \cap N' = \emptyset$, then we would have $[X] \cap N
\subset \bigcup N_\alpha$; then $[X] \subset N_{\alpha_0}$ for some $\alpha_0$, so that $X \in N_{\alpha_0+1}$, contradicting the
assumption that $X \in N \setminus \bigcup N_\alpha$. On the other hand, if we had $[X_n] \cap N' \ne
\emptyset$ for all $X_n \in N'$, then, successively choosing $X_{n+1} \in [X_n] \cap N'$, we would
obtain an infinite chain $\cdots X_{n+1} \varepsilon X_n \varepsilon X_{n-1} \varepsilon \cdots \varepsilon X_0$, contradicting the
hypothesis of the theorem.

(d) Suppose we have two transitive subclasses M and M', and an
isomorphism $g : (M, \in) \to (M', \in)$. We set $M_\alpha = V_\alpha \cap M$ and $M'_\alpha = V_\alpha
\cap M'$. An obvious induction on $\alpha$ then shows that g is the identity map.
The proposition is proved. □
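
The recursion f(X) = {f(Y) | Y ε X} used in step (b) can be tried out on a
finite relation. The following Python sketch is only an illustration under
stated assumptions: the function name mostowski_collapse and the
four-element example are hypothetical, and the transfinite induction of the
proof reduces to an ordinary recursion because the relation is finite and
has no infinite descending ε-chains.

```python
def mostowski_collapse(nodes, eps):
    """Collapse a finite, extensional, well-founded relation.

    nodes: list of hashable labels (playing the role of the class N).
    eps:   set of pairs (y, x) meaning "y eps x".
    Returns a dict f with f[x] = frozenset of f[y] over all y with y eps x.
    """
    members = {x: [y for (y, z) in eps if z == x] for x in nodes}
    f = {}

    def collapse(x):
        # Terminates because eps has no infinite descending chains.
        if x not in f:
            f[x] = frozenset(collapse(y) for y in members[x])
        return f[x]

    for x in nodes:
        collapse(x)
    return f


# Hypothetical example: a = "empty set", b = {a}, c = {a, b}, d = {b}.
nodes = ["a", "b", "c", "d"]
eps = {("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")}
f = mostowski_collapse(nodes, eps)
image = set(f.values())

# f is injective, carries eps to genuine membership, and its image is transitive.
is_injective = len(image) == len(nodes)
is_isomorphism = all(((y, x) in eps) == (f[y] in f[x]) for x in nodes for y in nodes)
is_transitive = all(element in image for s in image for element in s)
```

If the relation were not extensional (two labels with the same ε-members),
the same code would still run, but f would fail to be injective, which is
exactly where the proof uses the axiom of extensionality.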


7.8. Skolem’s paradox. Let M be a transitive countable model for the 
Zermelo—Fraenkel axioms. Then the following formulas are M-true: 


the axiom of infinity; 

the power set axiom; 

Cantor’s theorem that there is no mapping of x onto P(x) for any set x 
(this theorem is deducible from the Zermelo—Fraenkel axioms). 


Since $\mathcal{P}(X)$ is uncountable when X is countably infinite, the content of
the assertion that the power set axiom is true in the countable model M
must be very different from the content of the assertion that this axiom is
V-true. In fact, in $L_1$Set let "$y = \mathcal{P}(x)$" be abbreviated notation for the
formula $\forall z(\text{"}z \subset x\text{"} \Leftrightarrow z \in y)$. Let $\xi \in M$, $x^\xi = X \in M$, and $y^\xi = Y \in M$.
Then we easily see that

$$|\text{"}y = \mathcal{P}(x)\text{"}|_M(\xi) = 1 \iff Y = \{Z \mid Z \subset X \wedge Z \in M\},$$

i.e., $\mathcal{P}(X)_M = \mathcal{P}(X) \cap M$ plays the role of $\mathcal{P}(X)$ in M. Here $\mathcal{P}(X)_M$ is at
most countably infinite, since M is countable; so, from the usual point of
view, there exists a mapping of a countably infinite set X onto $\mathcal{P}(X)_M$.
This does not contradict Cantor's theorem, because the M-truth of
Cantor's theorem merely means that there are no (graphs of) such
mappings in the model M. Such graphs may exist outside of M, but, if we add
such a graph to M (along with everything that must be added for the
axioms to remain true), we thereby increase M, and at the same time
$\mathcal{P}(X)_M$, and the mapping stops being onto.

The various ways in which statements of set theory change their meaning in
countable models are customarily referred to as Skolem's paradox.

Cohen was the first who was able to use the properties of countable
models to prove the nondeducibility of the continuum hypothesis. In his
models sets of "M-intermediate" cardinality lie between $\omega_0$ and $\mathcal{P}(\omega_0)_M$,
although from an external point of view both $\omega_0$ and $\mathcal{P}(\omega_0)_M$, along with
all the other sets, are simply countable. Cohen introduced fundamentally
new ideas of relativizing the very notion of truth, and it is only with the
benefit of hindsight that we can so easily understand the situation in his
models. For details, see Chapter III.

Skolem himself, and other specialists on the foundations of mathematics,
were willing to work with countably infinite sets, but not with larger
infinities. They considered Skolem's paradox to be a manifestation of the
relative character of set theoretic concepts. In particular, they considered
that there exist "different continua" $\mathcal{P}(\omega_0)_M$, none of which coincides with
the "real" $\mathcal{P}(\omega_0)$.

From the point of view of the topologist or analyst, for whom the 
continuum is a working reality, the existence of countable models means 
that formal language has limitations as a means of imitating intuitive 
reasoning. We encountered similar limitations when discussing the formal 
axioms of induction in §4. 

For the psychologist or philosopher, perhaps the most interesting aspect 
of the situation is that any mathematician can understand the viewpoint of 
another mathematician (without having to agree with it). This means that 
what mathematician A says, though demonstrably incapable of conveying 
unambiguous information about the continuum, nevertheless is capable of 
bringing the brain of mathematician B to the point where it forms an idea 
of the continuum which adequately represents the idea in A’s brain. Then 
B is still free to reject this idea. 

“I know what you’re thinking about,” said Tweedledum: “but it isn’t so, 
nohow.” 


8 Language extensions 


8.1. In this section we study the formal version of “introducing new 
notation.” Here we only consider names of new functions and constants 
which are “demonstrably definable” in the language. Adding such names 
to the alphabet shortens formulas and formal deductions, but does not 
increase the set of deducible formulas—this will be the fundamental 
theorem of this section. 

Of course, in practice, abbreviated notation and well-chosen new names 
can immediately make accessible to our intuition entire areas of
mathematical facts that were previously inaccessible. One of the best-known
examples is the groups introduced by Galois to study equations. In 1924,
commenting on the attempt to curb the inflation in Germany by introduc- 
ing a new unit of currency, the Rentenmark, Hilbert remarked skeptically: 
“A problem cannot be solved by renaming the independent variable.” But, 
as his biographer Constance Reid noted, Hilbert was wrong: the economic 
situation gradually stabilized. 

We start with the following data. 


8.2. Let L' be a language in $\mathcal{L}_1$ with equality and with an infinite set of
variables, and let P'(x) be a formula in L' in which x occurs freely. We
recall that the abbreviated notation $\exists! x\, P'(x)$ (read: "there exists a unique
x with the property P'") stands for the formula

$$\exists x\, P'(x) \wedge \forall x\, \forall y\,(P'(x) \wedge P'(y) \Rightarrow x = y).$$

Let $\mathcal{E}'$ be a set of formulas in L' which contains Ax L', the axioms of
equality, and perhaps some special axioms. Suppose that the formula
$\exists! x\, P'(x, y_1, \ldots, y_n)$ is deducible from $\mathcal{E}'$, where P' has no free variables
other than $x, y_1, \ldots, y_n$. Intuitively, this means that P' defines x as an
implicit function of $y_1, \ldots, y_n$, and in the informal text we can introduce a
new notation for this function, say, $x = f(y_1, \ldots, y_n)$, and then always use
that notation. Now we give the formal version of this procedure.


8.3. Proposition. Under the conditions in 8.2, let L denote the language in $\mathcal{L}_1$
whose alphabet is obtained from the alphabet of L' by adding a new
operation symbol f of degree n if $n \ge 1$, or a constant f if n = 0. Let $\mathcal{E}$ be
the smallest set of formulas in L containing Ax L, the axioms of equality,
$\mathcal{E}'$, and the formula $P'(f(y_1, \ldots, y_n), y_1, \ldots, y_n)$.

Then there exists an explicitly describable map from the set of formulas 
of the (richer) language L to the set of formulas of the (poorer) language 
L’ which correlates with each Q a translation Q' and which has the 
following properties: 


(a) If f does not occur in Q, then the translation of Q coincides with Q. 

(b) If Q is deducible from $\mathcal{E}$ in L, then Q' is deducible from $\mathcal{E}'$ in L'. In
particular, the set of formulas in L' which are deducible from $\mathcal{E}'$ in L'
coincides with the set of formulas in L which do not contain f and are
deducible from $\mathcal{E}$ in L.


PROOF. 

Translation of formulas. Suppose $n \ge 1$. (The case n = 0 is analogous,
and is simpler, so we shall omit it.) The first effect of adding f is to increase
the set of terms: L includes terms of the form $f(t_1, \ldots, t_n)$, where f can
occur in $t_1, \ldots, t_n$, and so on. In order to decrease the number of
references to f, we must say "$f(t_1, \ldots, t_n)$" in a roundabout way: "that x
for which $P'(x, t_1, \ldots, t_n)$." This is the basic idea behind the translation of
formulas. We now give a precise inductive definition.

(a) A term $f(t_1, \ldots, t_n)$ is called a simple f-term if f does not occur in
$t_1, \ldots, t_n$.

(b) Let Q be an atomic formula in L. If f does not occur in Q, we let Q
be its own translation. If f occurs in Q, then there exists a simple f-term
$f(t_1, \ldots, t_n)$ which occurs in Q. We take the very first occurrence of a
simple f-term in Q, then take a variable symbol x which does not occur in
Q, substitute it in place of this occurrence, thereby obtaining a formula
$Q^*$, and finally construct the formula

$$Q'_{(1)}: \ \exists x\,(P'(x, t_1, \ldots, t_n) \wedge Q^*(x)).$$


We apply this procedure to $Q'_{(1)}$ to obtain $Q'_{(2)}$, and so on. After a finite
number of steps we obtain a formula $Q'_{(m)} = Q'$ in which f does not occur.
This Q' is the translation of Q.

(c) If Q is not an atomic formula, it has the form $\neg Q_1$, or $Q_1 * Q_2$
(where $*$ is a connective), or else $\forall y\, Q_1$ or $\exists y\, Q_1$. In all cases Q is
translated automatically using the translations of its components, i.e., by
"adding prime" to the component parts.
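
Step (b) of the translation is easy to mechanize. The Python sketch below is
a minimal illustration, not the book's procedure verbatim: terms and atomic
formulas are represented as nested tuples (an assumption), the names F,
P_PRIME, replace_first and translate_atomic are hypothetical, fresh
variables are called z0, z1, ..., and the repeated application "to $Q'_{(1)}$,
$Q'_{(2)}$, and so on" is read here as a recursion on the atomic part Q*.

```python
import itertools

F = "f"          # the new operation symbol of L (illustrative name)
P_PRIME = "P'"   # stands for the defining formula P'(x, y_1, ..., y_n)


def contains_f(t):
    """True if the symbol f occurs in the term or atomic formula t."""
    if isinstance(t, str):          # a variable
        return False
    return t[0] == F or any(contains_f(a) for a in t[1:])


def variables(t):
    """Set of variables occurring in a term or atomic formula."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(a) for a in t[1:]))


def replace_first(t, x):
    """Replace the first simple f-term in t by the variable x.

    Returns (new_t, args), where args are the arguments of the replaced
    f-term, or (t, None) if f does not occur in t.
    """
    if isinstance(t, str):
        return t, None
    if t[0] == F and not any(contains_f(a) for a in t[1:]):
        return x, t[1:]
    new_args, found = [], None
    for a in t[1:]:
        if found is None:
            a, found = replace_first(a, x)
        new_args.append(a)
    return (t[0], *new_args), found


def translate_atomic(q):
    """Translate an atomic formula q of L into a formula without f."""
    if not contains_f(q):
        return q
    fresh = (f"z{i}" for i in itertools.count())
    x = next(v for v in fresh if v not in variables(q))
    q_star, args = replace_first(q, x)
    return ("exists", x, ("and", (P_PRIME, x, *args), translate_atomic(q_star)))


# Example with n = 1: the atomic formula f(f(y)) = u.
example = translate_atomic(("=", (F, (F, "y")), "u"))
```

For the example, the inner term f(y) is the first simple f-term, so it is
replaced first; the outer occurrence becomes simple only after that
replacement and is eliminated in the second round.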

Translation of deductions. The problem is the following: Let $Q_1, \ldots, Q_n
= Q$ be a deduction of Q from $\mathcal{E}$, and let Q' be the translation of Q. We
must construct a deduction of Q' from $\mathcal{E}'$. The most obvious idea is to
write the sequence of translations $Q'_1, \ldots, Q'_n$. Why isn't this a deduction
of Q' from $\mathcal{E}'$, since MP and Gen are translated in a trivial way, and
tautologies are translated as tautologies? Because, for example, the logical
axiom $\forall x\, R(x) \Rightarrow R(f)$ might appear in this sequence, and this formula
stops being an axiom after it is translated, if f occurs in R. Hence, we must
fill in the sequence $Q'_1, \ldots, Q'_n$ by adding deductions from $\mathcal{E}'$ of certain of
its terms. This is a rather cumbersome combinatorial procedure, which one
can read in §74 of Kleene's book Introduction to Metamathematics (Van
Nostrand, New York-Toronto, 1952). (The moral of the story is that new
notation really does economize on time and space.)

Instead of using this procedure, we shall give an ineffective proof that
$\mathcal{E}' \vdash Q'$ using the deducibility criterion in 6.3. We state this criterion once
more:


(a) If Q' is true in every model of $\mathcal{E}'$, then $\mathcal{E}' \vdash Q'$. Since $\mathcal{E}'$ contains the
axioms of equality, we can slightly strengthen this as follows:

(b) If Q' is true in every normal model of $\mathcal{E}'$, then Q' is true in every model of
$\mathcal{E}'$.


Recall that = is interpreted as equality in a normal model. On the other 
hand, in §4 we showed that in any model = is interpreted as an equiva- 
lence relation which is compatible with the interpretation of all the 
constants, functions, and relations. Factoring out by this equivalence 
relation leads to a normal model, in which the truth values of all the 
formulas remain as before. 


(c) The normal models of $\mathcal{E}'$ (in the language L') coincide with the normal
models of $\mathcal{E}$ (in the language L).


More precisely, we can give the following natural one-to-one
correspondence between them which preserves the truth function. We shall limit
ourselves to the case $n \ge 1$. Let $\varphi$ be a normal interpretation of L' in M for
which $|Q'|_\varphi = 1$ for all $Q' \in \mathcal{E}'$. In particular, since $\mathcal{E}' \vdash \exists! x\, P'$, we have

$$|\exists! x\, P'(x, y_1, \ldots, y_n)|_\varphi = 1.$$

Computing the truth value on the left at a point $\xi \in M$ and using the
normality of the model, we then find that to every n-tuple $\langle y_1^\xi, \ldots, y_n^\xi \rangle \in
M^n$ there corresponds a unique $x^\xi \in M$ such that $|P'(x^\xi, y_1^\xi, \ldots, y_n^\xi)|_\varphi = 1$
(this is not the standard notation, but the meaning is clear). We now
interpret the symbol f (which is the new symbol in the language L) as the
function $M^n \to M$ which takes $\langle y_1^\xi, \ldots, y_n^\xi \rangle$ to $x^\xi$. We obviously obtain a
normal model for $\mathcal{E}$ in L.

Conversely, any normal model for $\mathcal{E}$ can be restricted to L' to obtain a
normal model for $\mathcal{E}'$.


(d) If Q is deducible from $\mathcal{E}$ in L, then Q' is true in any normal model for
$\mathcal{E}'$.


PROOF. Q is true in any model $\varphi$ for $\mathcal{E}$. To prove that Q' is true, we begin
with atomic formulas Q which contain f. In the notation in the first part of
the proof (translation of formulas), we construct $Q^*$ and then $Q'_{(1)} =
\exists x\,(P'(x, t_1, \ldots, t_n) \wedge Q^*(x))$. To verify that $|Q'_{(1)}|_\varphi = 1$, for each point
$\xi \in M$ we must find a variation $\xi'$ of $\xi$ along x for which

$$|P'|_\varphi(\xi') = 1 \quad \text{and} \quad |Q^*(x)|_\varphi(\xi') = 1.$$

We determine $x^{\xi'}$ from the condition $|P'(x^{\xi'}, t_1^\xi, \ldots, t_n^\xi)|_\varphi = 1$. The
description in (c) of the interpretation of f shows that we now have $|Q^*|_\varphi(\xi') =
|Q|_\varphi(\xi) = 1$.

Thus, truth is preserved in going from Q to $Q'_{(1)}$. Repeating this
procedure, we find that Q' is true for atomic formulas Q. Finally, the truth
of Q' in the general case is proved by induction on the number of
connectives and quantifiers. Combining the results (a)-(d), we then obtain
$\mathcal{E}' \vdash Q'$, which completes the proof of Proposition 8.3. □


8.4. EXAMPLES 

(a) In $L_1$Set the following formula is deducible from the axioms of
extensionality and pairing (and also the axioms of equality and the logical
axioms):

$$\exists! x\, \forall z\,(z \in x \Leftrightarrow z = u \vee z = w).$$


Using Proposition 8.3, we see that we may add to $L_1$Set a new degree 2
function symbol { }, "unordered pair," without changing the set of
formulas in $L_1$Set which are deducible from the Zermelo-Fraenkel axioms.
Therefore, without hesitation we may use not only the abbreviated
notation "x = {u, w}" as before, but also terms which are put together using
the symbol { }. In particular (here the use of { } is not normalized, but is
in agreement with tradition):
(b) We can introduce notation for the finite ordinals

$$\emptyset, \ \{\emptyset\}, \ \{\emptyset, \{\emptyset\}\}, \ \ldots$$

as terms in their own right in our language extension, and then imbed
formal arithmetic in formal set theory.

(c) After deducing the formula

$$\exists! x\,(\text{"}x \text{ is an ordinal"} \wedge \text{"}x \text{ is not finite"} \wedge \text{"}\forall \text{ ordinal } y < x,\ y \text{ is finite"})$$

from the Zermelo-Fraenkel axioms, we can introduce a new constant $\omega_0$,
and then continue to introduce names of more and more ordinals which
are demonstrably uniquely characterized by formulas in $L_1$Set (or in
language extensions which are formed in the same way).

We shall make use of this new freedom of action in Chapter III.


9 Undefinability of truth: the language SELF 


9.1. When modeled in formal languages, arguments of the "Liar Paradox"
type lead to important theorems on the limitations of the modes of
expression and proof in these languages. The best known of these theorems
are Tarski's theorem on the undefinability of the set of true formulas and
Gödel's theorem on the impossibility of effectively axiomatizing
arithmetic.

The next three sections are devoted to Tarski’s theorem. Our presenta- 
tion is based on an excellent article by Smullyan (Languages in which 
self-reference is possible, J. Symb. Logic. vol. 22, no. 1 (1957), 55-67). 

In this section we describe the extremely elementary language SELF
(which does not belong to $\mathcal{L}_1$), which was designed to illustrate
self-reference and which graphically demonstrates the idea of such a construction.
In §10 we introduce the language SAr, which is just as expressive as $L_1$Ar,
but does not belong to $\mathcal{L}_1$. Its syntax is close to that of SELF, which
greatly simplifies proofs. Finally, in §11 we use a method of Smullyan to
prove Tarski's theorem for SAr.


9.2. The language SELF (Smullyan’s Easy Language For self-reference) 

The alphabet of SELF. E, * (symmetric quotes), r (relation of degree
1), ¬ (negation).

The syntax of SELF. The distinguished expressions are: labels,
displays, formulas, and names. The label of any expression P is *P* ("P
in quotes"). The display of any expression P is P*P* ("something
with a label"). Formulas are expressions of the form rE...E*P* or
¬rE...E*P*, where E appears $k \ge 0$ times after r. We use the
abbreviated notation $rE^k{*}P{*}$ and $\neg rE^k{*}P{*}$ for formulas. Finally,
we introduce the binary relation "is the name of" on the set of all
distinguished expressions. This relation is defined recursively:


(a) The label of P is a name of P.

(b) If P is a name of Q, then EP is a name of the display of Q, i.e., a name
of the expression Q*Q*.

9.3. Remarks 


(a) If P is a name of Q, then the display of Q has at least two different
names: EP and *Q*Q**. Thus, an expression can have several
names. But, conversely, an expression is uniquely determined if we know
its name; names all have the form $E^k{*}P{*}$, $k \ge 0$. We shall write N(Q)
in place of "one of the names of Q."

(b) Every formula has the form rN(Q) or ¬rN(Q). In 9.4 we interpret
such a formula as the statement, "The expression Q has (or does not have)
the property R," and it is natural that the formula, in saying something
about Q, "calls Q by name."

(c) The expression E*E* is one of two possible names for itself. In
exactly the same way, the formula rE*rE* "says something about itself"
(see 9.5). The language SELF was constructed precisely in order to
produce these effects of self-reference with the fewest possible modes of
expression.


9.4. The standard interpretations. In order to give one of the standard
interpretations of the language SELF, we choose any set (property) R of
expressions of the language and introduce the truth function $|\ |_R$ on the
formulas by stipulating

$$1 - |\neg rN(Q)|_R = |rN(Q)|_R = \begin{cases} 1, & \text{if } Q \in R, \\ 0, & \text{otherwise.} \end{cases}$$

We say that a formula is R-true (R-false) if the value of $|\ |_R$ on the
formula equals 1 (resp. 0).


9.5. Undefinability Theorem. For any property R

$$R \cap \{\text{formulas}\} \ \ne \ \begin{cases} \{R\text{-true formulas}\}, \\ \{R\text{-false formulas}\}. \end{cases}$$

PROOF.

(a) The formula Q = ¬rE*¬rE* is R-true $\Leftrightarrow$ rE*¬rE* is
R-false $\Leftrightarrow$ $Q \notin R$, since E*¬rE* is a name of the display of ¬rE, i.e., a
name of Q. Thus, Q cannot both lie in R and be true, which proves the
first part of the theorem. The connection with the Liar paradox becomes
clear if we note that Q says about itself: "I do not have the property R."

(b) Analogously, the formula rE*rE* says about itself: "I have the
property R," and so cannot both lie in R and be R-false. □
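
Because SELF is so small, the whole argument can be replayed on strings.
The following Python sketch is an illustration only: the function names
label, display, named and truth are assumptions made here, a property R is
modelled as a Boolean-valued function on strings, and the four sample
properties are arbitrary choices.

```python
def label(p):
    """The label of P: P in symmetric quotes."""
    return "*" + p + "*"


def display(p):
    """The display of P: P followed by its own label."""
    return p + label(p)


def named(name):
    """Return the expression whose name is E^k *P*, or None if malformed."""
    k = len(name) - len(name.lstrip("E"))
    rest = name[k:]
    if len(rest) < 2 or rest[0] != "*" or rest[-1] != "*":
        return None
    q = rest[1:-1]          # the label *P* names P ...
    for _ in range(k):      # ... and each leading E passes to the display
        q = display(q)
    return q


def truth(formula, in_R):
    """Truth value of rN(Q) or ¬rN(Q) for the property in_R."""
    negated = formula.startswith("¬")
    body = formula[1:] if negated else formula
    if not body.startswith("r"):
        raise ValueError("not a formula of SELF")
    q = named(body[1:])
    if q is None:
        raise ValueError("not a formula of SELF")
    value = in_R(q)
    return (not value) if negated else value


LIAR = "¬rE*¬rE*"      # says "I do not have the property R"
TELLER = "rE*rE*"      # says "I have the property R"

sample_properties = [
    lambda e: True,
    lambda e: False,
    lambda e: len(e) % 2 == 0,
    lambda e: e.startswith("r"),
]

# For every R: LIAR is R-true exactly when it is not in R,
# and TELLER is R-true exactly when it is in R.
liar_checks = [truth(LIAR, R) != R(LIAR) for R in sample_properties]
teller_checks = [truth(TELLER, R) == R(TELLER) for R in sample_properties]
```

The two lists consist of True values whatever property is chosen, since
named("E*¬rE*") is the formula LIAR itself and named("E*rE*") is TELLER.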


10 Smullyan’s language of arithmetic 


10.1. In this section we describe the language of arithmetic SAr and its
standard interpretation. The main difference between SAr and $L_1$Ar is that
in SAr we are allowed to form "class terms," that is, names of certain sets of
natural numbers. More precisely, if P(x) is a formula in SAr with one free
variable x, then the expression x(P(x)) in SAr names the set $\{x \in N \mid P(x)$
is true$\}$, and the expression $x(P(x))\bar{k}$, where the term $\bar{k}$ is a name for an
integer $k \ge 1$, is a name for the statement "k satisfies P." The greater
richness of the modes of expression in SAr, as opposed to $L_1$Ar, does not
increase the class of subsets in $\bigcup_{r \ge 1} N^r$ which are definable by formulas.
But it brings the syntax of SAr so close to that of SELF that we can
imitate the proof of Theorem 9.5.

In addition, the alphabet of SAr is somewhat altered and shortened in
comparison with the alphabet of $L_1$Ar, but this is only done in order to
simplify the description of the syntax. These changes do not make the logic
of SAr any poorer.


10.2. The alphabet of SAr: x (a variable); ' (used to form a countable set of
variables x, x', x'', ...); · (multiplication, a degree 2 operation); ↑ (raising
to a power, a degree 2 operation, as in Algol); = (equality); ↓ (a
connective, the conjunction of negations); (, ) (parentheses); and 1 (the
constant one).


10.3. The syntax and interpretation of SAr. Because we are allowed to form
the class terms x(P(x)) and the formulas $x(P(x))\bar{k}$, the syntax is more
complicated than in languages of $\mathcal{L}_1$. We use induction on the integer $i \ge 0$
to define two sequences of sets of expressions: $\mathrm{Tm}_{2i}$ (terms of rank $\le 2i$)
and $\mathrm{Fl}_{2i+1}$ (formulas of rank $\le 2i + 1$). (Using double induction, on the
rank of the term or formula, and, within the set $\mathrm{Tm}_{2i}$ or $\mathrm{Fl}_{2i+1}$, on the
length of the term or formula, one can prove a unique reading lemma;
this lemma is the basis for defining free and bound occurrences of
variables and truth functions. However, since there is nothing new here
beyond what was done in §1, we leave the details to the reader.)

Along with our description of the syntax, we give a parallel description
of the standard interpretation of SAr in N. In order to interpret
expressions with free variables, we must fix a point $\xi \in N^\infty = N \times N \times \cdots$,
which we shall identify with the corresponding infinite vector with natural
number coordinates. Here the value of the kth variable $(x^{\prime \cdots \prime})^\xi$ (k − 1
primes) is in the kth place in the vector.

($a_0$) $\mathrm{Tm}_0$ is the set of numerical terms, i.e., the least set of expressions
which contains the variables x, x', x'', ... and the names of the natural
numbers 1, 11, 111, ... and is closed with respect to forming the expressions
$(t_1) \cdot (t_2)$ and $(t_1) \uparrow (t_2)$, where $t_1, t_2 \in \mathrm{Tm}_0$.

Instead of $x^{\prime \cdots \prime}$ (k − 1 primes) we shall write $x_k$, and instead of 1...1
($k \ge 1$ ones) we shall write $\bar{k}$. The term $\bar{k}$ is interpreted as k (not
depending on $\xi$); $x_k^\xi$ is interpreted as the kth coordinate of $\xi$; and, if
$t_1^\xi, t_2^\xi \in N$ have already been determined, then $[(t_1) \cdot (t_2)]^\xi = t_1^\xi t_2^\xi$ and
$[(t_1) \uparrow (t_2)]^\xi = (t_1^\xi)^{t_2^\xi}$. The occurrences of the expressions $x_k = x^{\prime \cdots \prime}$ in any
term in $\mathrm{Tm}_0$ are obviously independent of one another. All such
occurrences are considered free.

($b_0$) $\mathrm{Fl}_1$ is the least set of expressions which contains all expressions of the
form $t_1 = t_2$ (where $t_i \in \mathrm{Tm}_0$) and is closed with respect to forming the
expressions $(P_1) \downarrow (P_2)$, where $P_i \in \mathrm{Fl}_1$. In other words, $\mathrm{Fl}_1$ is the logical
closure of the set of atomic formulas $\{t_1 = t_2 \mid t_i \in \mathrm{Tm}_0\}$.

Choosing a point $\xi$ determines a truth value for any formula $P \in \mathrm{Fl}_1$ by
induction on the number of times ↓ occurs:

$$|t_1 = t_2|(\xi) = \begin{cases} 1, & \text{if } t_1^\xi = t_2^\xi, \\ 0, & \text{otherwise;} \end{cases}$$

$$|(P_1) \downarrow (P_2)|(\xi) = \begin{cases} 1, & \text{if } |P_1|(\xi) = |P_2|(\xi) = 0, \\ 0, & \text{otherwise.} \end{cases}$$

All occurrences of variables in elements of $\mathrm{Fl}_1$ are independent of one
another, and are considered free.
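
The two clauses of this truth definition translate directly into a small
evaluator. The Python sketch below is a hedged illustration: the tuple
encoding of terms and formulas, the representation of a point as a
dictionary from indices k to the values of $x_k$, and the helper names neg,
disj and conj are all assumptions made here. The helpers show how the other
connectives can be recovered from ↓ alone, in the spirit of the definition
of "¬Q" as "Q"↓"Q" in 10.4.

```python
def term_value(t, xi):
    """Value of a numerical term at the point xi (dict: k -> value of x_k)."""
    kind = t[0]
    if kind == "var":        # ("var", k) stands for the variable x_k
        return xi[t[1]]
    if kind == "num":        # ("num", k) stands for the numeral 1...1 (k ones)
        return t[1]
    if kind == "mul":        # (t1) · (t2)
        return term_value(t[1], xi) * term_value(t[2], xi)
    if kind == "pow":        # (t1) ↑ (t2)
        return term_value(t[1], xi) ** term_value(t[2], xi)
    raise ValueError(kind)


def truth(p, xi):
    """Truth value (0 or 1) of a quantifier-free formula at the point xi."""
    kind = p[0]
    if kind == "eq":         # t1 = t2
        return int(term_value(p[1], xi) == term_value(p[2], xi))
    if kind == "nor":        # (P1) ↓ (P2): true iff both parts are false
        return int(truth(p[1], xi) == 0 and truth(p[2], xi) == 0)
    raise ValueError(kind)


def neg(p):
    return ("nor", p, p)


def disj(p, q):
    return neg(("nor", p, q))


def conj(p, q):
    return ("nor", neg(p), neg(q))


# The formula (x_1)·(x_1) = x_2 at a point with x_1 = 3, x_2 = 9.
square = ("eq", ("mul", ("var", 1), ("var", 1)), ("var", 2))
point = {1: 3, 2: 9}
square_value = truth(square, point)
negated_value = truth(neg(square), point)
```

At the chosen point the formula is true and its negation is false, so
square_value is 1 and negated_value is 0.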

Now let $i \ge 1$, and suppose that the sets $\mathrm{Tm}_{2k-2}$, $\mathrm{Fl}_{2k-1}$ are already
defined for $k \le i$ along with the interpretations and the division into free
and bound occurrences of variables. We define the next sets $\mathrm{Tm}_{2i}$ and
$\mathrm{Fl}_{2i+1}$ as follows.

($a_i$) $\mathrm{Tm}_{2i}$ consists of the class terms of rank $\le 2i$:

$$\mathrm{Tm}_{2i-2} \cup \{x_k(P) \mid k \ge 1,\ P \in \mathrm{Fl}_{2i-1}\}$$

($\mathrm{Tm}_0$ need not be included when i = 1). These elements have the following
interpretation:

$$(x_k(P))^\xi = \{x_k^{\xi'} \mid \xi' \text{ runs through the variations of } \xi \text{ along } x_k \text{ for which } |P|(\xi') = 1\} \subset N.$$

All occurrences of the variable $x_k$ in $x_k(P)$ are considered bound, and the
occurrences of other variables remain the same (free or bound) as in P.
($b_i$) $\mathrm{Fl}_{2i+1}$ is the logical closure of the set of expressions

$$\mathrm{Fl}_{2i-1} \cup \{x_k(P) = x_k(Q) \mid k \ge 1;\ P, Q \in \mathrm{Fl}_{2i-1}\} \cup \{T\bar{k} \mid k \ge 1,\ T \in \mathrm{Tm}_{2i}\}.$$

The truth function is defined as follows: if we set $x_k(P) = T_1$ and $x_k(Q) =
T_2$, then

$$|x_k(P) = x_k(Q)|(\xi) = \begin{cases} 1, & \text{if } T_1^\xi = T_2^\xi \text{ as subsets of } N, \\ 0, & \text{otherwise;} \end{cases}$$

$$|T\bar{k}|(\xi) = \begin{cases} 1, & \text{if } k \in T^\xi, \\ 0, & \text{otherwise.} \end{cases}$$

The function $|\ |$ is extended to the logical closure in the same way as in ($b_0$).
All occurrences of variables in $x_k(P) = x_k(Q)$ and in $T\bar{k}$ are the same (free
or bound) as in the corresponding class term. Composition using the
connective ↓ does not change the nature of the occurrence. As in
subsection 2.10, one can prove that $|P|(\xi)$ only depends on the $\xi$-values of the
variables which have free occurrences in the formula $P \in \bigcup_{i \ge 0} \mathrm{Fl}_{2i+1}$.
This finishes the description of the syntax and semantics of SAr.

In conclusion, we show that the classes of sets in $\bigcup_{r \ge 1} N^r$ which are
definable by formulas in $L_1$Ar and in SAr coincide. This result is not used
in the proof of Tarski's theorem in the next section. However, the result
itself and the method of proof are instructive, and we shall return to these
ideas in Part III of the book.

Let $L_1$Ar have a countable set of variables. If we denote them by
$x_1, x_2, \ldots, x_n, \ldots$ and identify $x_i$ with $x^{\prime \cdots \prime}$ (i − 1 primes), we can also
identify the interpretation classes for $L_1$Ar and SAr in the obvious way.
Our claim that the classes of definable sets coincide is then an immediate
consequence of the following stronger fact:


10.4. Proposition. Two translation mappings

$$\{\text{formulas of } L_1\mathrm{Ar}\} \rightleftarrows \{\text{formulas of SAr}\}$$

can be explicitly defined with the following properties:


(a) At every point $\xi$ the truth values of any formula and its translation
coincide.
(b) The sets of free variables of any formula and its translation coincide.


We note that the mappings we define will not be inverse to each other!


PROOF. 

(a) The translation from $L_1$Ar to SAr. The translation of a formula P will
be denoted "P". We first translate atomic formulas, and then use
induction on the length. The alphabet of SAr does not have addition, but it has
both multiplication and raising to a power, so that in place of z = x + y we
can write $2^z = 2^x \cdot 2^y$.
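
The replacement of addition by multiplication and exponentiation is easy to
sanity-check numerically. The short Python sketch below is illustrative
only; the two function names and the finite test range are assumptions made
here.

```python
def holds_with_addition(x, y, z):
    # The L_1 Ar formula z = x + y.
    return z == x + y


def holds_without_addition(x, y, z):
    # Its substitute 2^z = 2^x · 2^y, using only · and ↑ as in SAr.
    return 2 ** z == (2 ** x) * (2 ** y)


# Both conditions agree on a finite sample of triples.
agree = all(
    holds_with_addition(x, y, z) == holds_without_addition(x, y, z)
    for x in range(8)
    for y in range(8)
    for z in range(16)
)
```

The agreement is not an accident of the sample: $2^x \cdot 2^y = 2^{x+y}$, and the
map $z \mapsto 2^z$ is injective on the natural numbers.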

($a_1$) Atomic formulas. They have the form $t_1 = t_2$. By "carrying out the
operations," we replace every nonzero term in $L_1$Ar by a "normalized
term," i.e., a polynomial of the form $\sum x_1^{i_1} \cdots x_n^{i_n}$, where the monomials
are written in the form $(\ldots((x_1 \cdot x_1) \cdot \ldots \cdot x_2) \cdot x_3) \ldots)$, then arranged in
lexicographical order, and finally separated by parentheses: $(\ldots((m_1 +
m_2) + m_3) + \ldots)$. It is clear how to correlate such a term t to the term
"$2^t$" in SAr. For example, "$\bar{2} \uparrow ((x_1) \cdot (x_1) + x_2)$" is $(\bar{2} \uparrow ((x_1) \cdot (x_1))) \cdot (\bar{2} \uparrow (x_2))$.
By definition, the translation "$2^0$" is 1. Then we define the translation of
the formula $t_1 = t_2$ to be "$2^{t_1}$" = "$2^{t_2}$." It is clear that such a formula and
its translation have the same variables and are true at the same points $\xi$.

($a_2$) If "Q," "$Q_1$," and "$Q_2$" have already been defined, then "¬Q" is
defined as "Q"↓"Q". We similarly construct "$Q_1 * Q_2$" for the other
connectives (see "Digression: Syntax" in Chapter I).

($a_3$) If "Q" has already been defined, then "$\forall x_k\, Q$" is defined as

$$x_k(\text{"}Q\text{"}) = x_k(x_k = x_k).$$


Both the formula and its translation are true at a point $\xi$ if and only if Q
(and "Q") are true at all variations $\xi'$ of $\xi$ along $x_k$. They also have the
same free variables, since, by induction, we may assume that this is the
case for Q and "Q."

($a_4$) By definition, "$\exists x_k\, Q$" coincides with "$\neg \forall x_k\, \neg Q$."

(b) The translation from SAr to $L_1$Ar. As before, we let "P" denote the
translation of a formula P, although this time P will be a formula in SAr
and "P" will be a formula in $L_1$Ar.

There is a subtle point here, namely, how to translate $x_1 = x_2 \uparrow x_3$. It will
be shown in Part II of the book that such a translation exists, and can even
be taken in the form $\exists x_4 \cdots \exists x_n\, p(x_1, x_2, x_3, x_4, \ldots, x_n)$, where p is an
atomic formula in $L_1$Ar. Here we shall take this fact on faith, and choose a
translation "$x_1 = x_2 \uparrow x_3$" once and for all.

($b_1$) Translation of formulas in $\mathrm{Fl}_1$. The following rules give an inductive
definition:

"$t_1 = t_2$" has exactly the same form if $t_1, t_2 \in \{\text{variables}\} \cup
\{1, 11, \ldots\}$ (of course, in the sense that $x^{\prime \cdots \prime}$ is replaced by $x_k$ and
1⋯1 is replaced by $(\cdots((1 + 1) + 1) + \cdots)$). "$x_k = t_1 \cdot t_2$" has the
form $\exists x_i\, \exists x_j(\text{"}x_i = t_1\text{"} \wedge \text{"}x_j = t_2\text{"} \wedge x_k = x_i \cdot x_j)$ and "$x_k = t_1 \uparrow t_2$" has the
form $\exists x_i\, \exists x_j(\text{"}x_i = t_1\text{"} \wedge \text{"}x_j = t_2\text{"} \wedge \text{"}x_k = x_i \uparrow x_j\text{"})$, where $x_i$ and $x_j$ are the
first two variables not occurring in $t_1$ or $t_2$. We similarly translate formulas
with the left and right-hand sides permuted, and also with 1⋯1 instead
of $x_k$. We further stipulate that "$t_1 = t_2$" has the form $\exists x_i(\text{"}x_i = t_1\text{"} \wedge \text{"}x_i =
t_2\text{"})$, where $x_i$ is the first variable not occurring in $t_1$ or $t_2$, and where we
only assume that neither $t_1$ nor $t_2$ is a variable or 1⋯1. It is clear that
the truth function and the set of free variables are preserved under these
translations.

($b_2$) Suppose that the formulas in $\mathrm{Fl}_{2i-1}$ have already been translated.
Let

"$x_k(P_1) = x_k(P_2)$" be $\forall x_k(\text{"}P_1\text{"} \Leftrightarrow \text{"}P_2\text{"})$, and
"$x_k(P)\bar{n}$" be "P"($\bar{n}$),

where on the right $\bar{n} = (\cdots((1 + 1) + 1) + \cdots)$ is substituted in place of
all free occurrences of $x_k$ in "P." This completes the proof. □


11 Undefinability of truth: Tarski’s theorem 


11.1. The language SAr is interpreted in N, and not in the set of its own 
formulas the way SELF is. In order to be able to determine the set of 
definable formulas, we number formulas by (certain) integers as follows. 

We number the symbols of the alphabet (of which there are nine) from 
1 to 9 in any order, as long as the symbol 1 corresponds to 9. We then set (here
$a_i \in \{\text{alphabet of SAr}\}$ and $\nu(a_i)$ is the number of $a_i$):

$$\text{number}(a_1 \cdots a_k) = n(a_1 \cdots a_k) = \sum_{i=1}^{k} \nu(a_i)\,10^{k-i} + 1.$$

In other words, we obtain the number of an expression by replacing all of
its symbols by the corresponding decimal digits (1 is replaced by 9), then 
reading the resulting number in the decimal system and adding 1. It is 
clear that an expression can be reconstructed in a unique way if we know 
its number. 
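
The numbering of 11.1 can be sketched in a few lines of Python. The ASCII stand-ins for the nine symbols and the particular assignment of digits are assumptions made only for illustration (the text fixes nothing except that the symbol 1 receives the digit 9), and the names `number` and `expression_of` are hypothetical.

```python
# Hypothetical ASCII stand-ins for the nine symbols of SAr; the only
# requirement taken from the text is that the symbol "1" gets the digit 9.
SYMBOLS = ["x", "'", "(", ")", "=", ".", "^", "|", "1"]
DIGIT = {symbol: index + 1 for index, symbol in enumerate(SYMBOLS)}
SYMBOL = {digit: symbol for symbol, digit in DIGIT.items()}


def number(expression: str) -> int:
    """Return n(a_1...a_k): read the digits in base 10, then add 1."""
    value = 0
    for symbol in expression:
        value = 10 * value + DIGIT[symbol]
    return value + 1


def expression_of(n: int) -> str:
    """Reconstruct a nonempty expression from its number."""
    return "".join(SYMBOL[int(digit)] for digit in str(n - 1))


print(number("x=x"))         # digits 1, 5, 1 give 151; adding 1 gives 152
print(expression_of(152))    # x=x
```

Since no symbol is assigned the digit 0, distinct expressions have distinct digit strings, which is why the reconstruction is unique.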

The name in SAr of the number of an expression P, i.e.,
$1 \cdots 1$ ($n(P)$ times), is called the label of P. As in SELF, we shall denote
the label of P by ∗P∗ (but now this is abbreviated notation). We call the
expression P∗P∗ the display of P.


11.2. Definition. Let P(x) be a formula in SAr with one free variable x. 
(a) An expression Q satisfies P if the number of Q lies in the set 
{k|P(k) is true}. 
(b) An expression Q is displayed in P if the display of Q satisfies P. 


11.3. Lemma. Let P(x) be as in 11.2. Let $P_D(x)$ denote the formula
$P((x) \cdot ((10) \uparrow (x)))$ (i.e. the term “$x \cdot 10^x$” is substituted in place of all free
occurrences of x). Then the set of expressions satisfying $P_D$ coincides with
the set of expressions displayed in P.


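Proof. If Q has number k, then the display of Q has number $k \cdot 10^k$
(which is why 1 has number nine!):

$$n(Q{\ast}Q{\ast}) = n(Q\,\underbrace{1 \cdots 1}_{n(Q)\text{ times}})
= (n(Q) - 1)\,10^{n(Q)} + \underbrace{9 \cdots 9}_{n(Q)\text{ times}} + 1
= n(Q)\,10^{n(Q)}.$$

Hence the number n(Q) satisfies $P_D$ if and only if the number n(Q∗Q∗)
satisfies P; that is, Q satisfies $P_D$ if and only if the display of Q
satisfies P. □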
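
The identity in this proof is easy to test mechanically. The sketch below assumes the same hypothetical ASCII stand-ins and digit assignment as any numbering with the symbol 1 mapped to 9; the function names are illustrative and not part of the text.

```python
# Assumed digit assignment; only "1" -> 9 is prescribed by the text.
SYMBOLS = ["x", "'", "(", ")", "=", ".", "^", "|", "1"]
DIGIT = {symbol: index + 1 for index, symbol in enumerate(SYMBOLS)}


def number(expression: str) -> int:
    """n(a_1...a_k): decimal value of the digit string, plus 1."""
    value = 0
    for symbol in expression:
        value = 10 * value + DIGIT[symbol]
    return value + 1


def display(expression: str) -> str:
    """Q followed by its label, i.e. by n(Q) copies of the symbol 1."""
    return expression + "1" * number(expression)


def lemma_identity_holds(expression: str) -> bool:
    """Check n(display of Q) == n(Q) * 10 ** n(Q)."""
    n = number(expression)
    return number(display(expression)) == n * 10 ** n


print(all(lemma_identity_holds(q) for q in ["x", "x'", "x=x", "(1)"]))
```

For Q = `x` the number is 2, the display is `x11`, and its digit string 199 gives 199 + 1 = 200 = 2 · 10².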


11.4. Theorem. For any formula P(x) as in 11.2, we have:

$$\{\text{formulas satisfying } P\} \neq
\begin{cases} \text{the set of true formulas}, \\ \text{the set of false formulas}. \end{cases}$$


Proof. We consider the Tarski-Smullyan formula S: $xP_D\,{\ast}xP_D{\ast}$.
According to the definitions, we have (recall that $xP_D$ is a class term and
${\ast}xP_D{\ast}$ is the name of a number): S is true $\Leftrightarrow$ $xP_D$ satisfies $P_D$
$\Leftrightarrow$ $xP_D$ is displayed in P (by Lemma 11.3) $\Leftrightarrow$ the display of $xP_D$
satisfies P $\Leftrightarrow$ S satisfies P. Hence, S is either not false and satisfies P,
or else is false and does not satisfy P. Therefore, the set of formulas
satisfying P cannot coincide with the set of false formulas. As in §9, the
formula S says, “I satisfy P.”

Similarly, the formula

$$x((P)\downarrow(P))_D\ {\ast}x((P)\downarrow(P))_D{\ast}$$

says, “I do not satisfy P,” and thus either satisfies P or is true, but not
both. The theorem is proved. □


11.5. Of course, Lemma 11.3 is pure magic. The decimal system really has 
nothing to do with all this, and 1 did not really have to be number nine, 
but this way everything is much prettier. 

More generally, let ?Ar be any language of arithmetic with a finite 
alphabet containing the alphabet of SAr. Let the rules for forming dis- 
tinguished expressions and the standard interpretation of formulas in ?Ar 
be an arbitrary extension of the rules in SAr. We only require that the 
terms and formulas in SAr keep their earlier meaning, and that, for any 
formula P(x) in ?Ar with a free variable x, the expression x(P(x))k must 
be a formula in ?Ar and be interpreted by the same recipe as in SAr. (For 
example, we might add to SAr the + sign, the connectives, and the
quantifiers, and then allow formulas to be constructed by the rules of $L_1$ as
well, thereby imbedding $L_1\mathrm{Ar}$ in ?Ar.)

Then the Undefinability theorem 11.4 holds for ?Ar. 

We must choose the numbering as follows: if m is the number of
elements in the alphabet of ?Ar and $\nu$ is a numbering of the symbols for
which $\nu(1) = m$, then

$$n(a_1 \cdots a_k) = \sum_{i=1}^{k} \nu(a_i)\,(m+1)^{k-i} + 1.$$

Then, using the same conventions as before, we have

$$n(Q{\ast}Q{\ast}) = n(Q\,\underbrace{1 \cdots 1}_{n(Q)\text{ times}})
= (n(Q) - 1)(m+1)^{n(Q)} + m \sum_{j=0}^{n(Q)-1} (m+1)^j + 1
= n(Q)(m+1)^{n(Q)}.$$


Defining $P_D(x)$ as $P((x) \cdot ((m+1) \uparrow (x)))$, without any further alterations
we obtain Lemma 11.3 and Tarski’s theorem for ?Ar.
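
The only computational step in the display formula of 11.5 is a finite geometric sum. Writing $N = n(Q)$ as shorthand (an abbreviation introduced here, not in the text), the label contributes $N$ digits, each equal to $m$, in base $m + 1$:

```latex
\begin{align*}
m \sum_{j=0}^{N-1} (m+1)^j
  &= m \cdot \frac{(m+1)^N - 1}{(m+1) - 1} = (m+1)^N - 1, \\
n(Q{\ast}Q{\ast})
  &= (N-1)(m+1)^N + \bigl((m+1)^N - 1\bigr) + 1 = N\,(m+1)^N .
\end{align*}
```

With $m = 9$ this is the decimal identity used in the proof of Lemma 11.3.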


11.6. Remarks 

(a) If Tarski’s theorem were not true, and there were a formula P(x) 
such that {Q|Q is a formula and P(n(Q)) is true} coincided with the set 
of all true formulas of arithmetic, then this would mean that all number 
theoretic questions would reduce to a series of problems all of the same 
type. Instead of asking, “Is assertion number n true?” we could ask, “Is 
P(n) true?” Although such an all-encompassing problem could still be
rather complicated (in a certain sense even “infinitely complicated,” see 
Part III), Tarski’s theorem says that arithmetic has much more diversity 
than could be contained in any such single problem. 

(b) We still have reason to suspect that perhaps everything worked out 
this way because we could “cleverly” number the formulas. This is not the 
case; the results in Part III will imply that Tarski’s theorem remains true 
for any numbering in which a formula and its number can be effectively 
reconstructed from one another. 

(c) It is natural to ask whether the set of numbers of provable, or 
deducible, formulas is definable (for some set of axioms and rules of 
deduction, for example in SAr). The answer is yes: this set is definable. We
shall give some intuitive considerations in this direction, which anticipate 
the systematic theory in Part III. 

However we define the notion of provability, it is natural to expect it to 
have the following property: there exists an algorithm (for example, a 
computer program) which for any text of the given language determines 
whether this text is a proof and, if so, of what formula. 

We now write a program which constructs the texts in the language in 
lexicographical order, verifies whether each one is a proof, and, when it is, 
computes the number of the formula it proves. Roughly speaking, the
graph of the function (number of a proof) $\mapsto$ (number of the formula
proved) is definable in $L_1\mathrm{Ar}$ because machine logic and arithmetic are
imbedded in $L_1\mathrm{Ar}$. Hence, the set of numbers of provable formulas is
definable in $L_1\mathrm{Ar}$, in SAr, or in any language ?Ar as in 11.5.
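
The program described above can be sketched as follows. This is only an illustration under stated assumptions: the proof checker is passed in as a parameter, `toy_checker` is a made-up stand-in with no logical content, and texts are enumerated by length first and alphabetically within each length, so that every text is eventually reached.

```python
from itertools import count, islice, product
from typing import Callable, Iterator, Optional


def texts(alphabet: str) -> Iterator[str]:
    """All nonempty texts: shortest first, alphabetical within a length."""
    for length in count(1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def provable(alphabet: str,
             proves: Callable[[str], Optional[str]]) -> Iterator[str]:
    """Yield the formula proved by each text that is a proof."""
    for text in texts(alphabet):
        formula = proves(text)      # None when `text` is not a proof
        if formula is not None:
            yield formula


def toy_checker(text: str) -> Optional[str]:
    """Hypothetical checker: the text `w#` counts as a proof of `w`."""
    body, last = text[:-1], text[-1]
    if last == "#" and body and "#" not in body:
        return body
    return None


print(list(islice(provable("ab#", toy_checker), 3)))   # ['a', 'b', 'aa']
```

The generator never terminates on its own; it lists provable formulas but gives no way of recognizing that a formula will never appear.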

Combining this discussion with Tarski’s theorem, we obtain the following
form of Gödel’s theorem:


11.7. Gödel’s Incompleteness Theorem for Arithmetic. In any language of
arithmetic of type ?Ar, and for any definition of deducibility in which the
set of (numbers of) deducible formulas is definable,


$$\{\text{true formulas}\} \neq \{\text{deducible formulas}\}.$$


In Part III we discuss more general formulations of this theorem and 
other versions of the proof, and we give a detailed verification of the 
principle in 11.6(c) for deductions in $L_1\mathrm{Ar}$.


Digression: self-reference 


In natural languages it is only recently that linguists have taken note of the 
so-called “performative” statements. The characteristic feature of such a 
statement is self-reference, which can be defined as the ability to “refer to a
reality that it creates itself, because it is stated under circumstances which
make it into an act” (E. Benveniste, La Philosophie analytique et le
langage, Les Et. Philos., No. 1 (1963) 9). Examples of performative
statements include: “I solemnly swear,” the saying of which constitutes the act
of swearing; “I proclaim a general mobilization,” and “I appoint you 
director,” when these two statements come from an authority that has the 
power to carry out the respective acts. If we look carefully at the semantics 
of performative statements, we find an imperative nuance, even though it 
is expressed by the declarative mood of the verb. 

In this connection, it is interesting to compare the role of self-reference 
in formal and algorithmic languages (see also subsection 1.2 of Chapter I). 
In formal languages (and, in general, in descriptive languages), self-refer- 
ence leads to logical circles, to paradoxes, or, if we try to avoid logical 
circles, to demonstrations of certain inadequacies of the language. On the 
other hand, in algorithmic languages (and, in general, in control languages 
and systems), self-reference is the most important device for turning a 
finite program into a process that is potentially arbitrarily long (“loops”); 
it takes part in the control instructions (feedback), and is among the 
fundamental possibilities of the system. 

A similar dichotomy can also be found in psychological behavior— 
compare with the distinction between introspection and self-improvement. 

Finally, self-reference can play a role in the genetic causality of aging 
processes (of biological and social systems). A self-regenerating cycle, 
when repeated many times, leads to erosion at the place of generation. 


12 Quantum logic 


12.1. The last section of this chapter is devoted to certain physical facts 
and to the mathematical constructions which have been developed to 
describe them. In particular, we discuss von Neumann’s theorem that it is 
impossible to introduce hidden variables into the quantum mechanical 
picture of the world. This material, while not completely traditional for a 
course in logic, is relevant here for two reasons. 

In the first place, von Neumann’s theorem is a vivid example of a 
metaphysical assertion. It is concerned with properties of the language, 
rather than with the subatomic world described by the language, and thus 
is analogous to, for example, Tarski’s theorem in metamathematics. This is 
why it occupies an isolated position in physics, and why we are interested 
in it here. 

In the second place, analyzing quantum mechanical phenomena reveals 
a profound divergence between the internal logical structures of the 
macroworld and the microworld. Although explanations of these dif- 
ferences by means of natural language and natural logic are agonizingly 
difficult and, in the last analysis, always leave one feeling unsatisfied, these 
attempts to explain continue. The development of the foundations of 
physics in the twentieth century has taught us a serious lesson. Creating 
and understanding these foundations turned out to have very little to do 
with the epistemological abstractions which were of such importance to the
twentieth century critics of the foundations of mathematics: finiteness, 
consistency, constructibility, and, in general, the Cartesian notion of intui- 
tive clarity. Instead, completely unforeseen principles moved into the 
spotlight: complementarity, and a nonclassical, probabilistic truth func- 
tion. The electron is infinite, capricious, and free, and does not at all share 
our love for algorithms. 

The following exposition is based on the article by S. Kochen and E. P. 
Specker in J. Math. Mech., vol. 17, no. 1 (1967), 59-87. Subsections 
12.9-12.16 contain pure algebra and formally do not depend on the 
preceding semi-physical considerations. 


12.2. The atom of orthohelium. We now describe certain characteristics of 
the behavior of the physical system “an atom of orthohelium in the state 
$n = 2$, $l = 0$, $s = 1$.” Such a helium atom is in an excited state: its two
electrons are on the second energy level, and their spin is pointed in the 
same direction. Nevertheless, the state is meta-stable, because, in order to 
fall to the first energy level, the electrons must turn their spins in opposite 
directions (parahelium); this creates a certain stability. 

Spin is a physical quantity which is expressed in the same units as the 
“angular momentum.” The total spin of our system (in atomic units:
$h = 2\pi$) is represented by a unit vector in physical three-dimensional space.
As a first approximation we may think of it as changing with time but 
having instantaneous values which can be measured. (The inadequacy of 
this picture will soon be demonstrated.) 

An experiment for the purpose of measuring the instantaneous value of 
the spin of our system could consist of turning on a magnetic field having 
a specified geometry and registering the shift in energy levels (spectral 
lines) of the atom. Each outcome of such an experiment can be precisely 
interpreted as a measurement of the projection of the spin on some axis, 
which is uniquely determined by the geometry of the field. We shall 
identify these directions with points of the unit sphere $S^2$.

Quantum mechanics makes the following positive assertions concerning 
measurements of the spin of orthohelium. The following quantities are 
measurable: 

(a) the projection $s(a, t)$ of the spin in the direction $a \in S^2$ at the
moment of time $t$;

(b) the lengths $|s|(a_i, t)$, $i = 1, 2, 3$, of three projections of the spin in
three pairwise orthogonal directions $\{a_1, a_2, a_3\} \subset S^2$ (a “frame”) at the
time $t$.

The predictions concerning the results of these measurements are as
follows:

(c) $s(a, t)$ is a random variable which can only take the values $-1, 0, 1$.
(The probabilities of these values can be predicted from the results of the
previous measurements, but this is not essential for us here.)

(d) $\sum_{i=1}^{3} |s|(a_i, t) = 2$ for any frame $\{a_1, a_2, a_3\}$ and any $t$.
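
Prediction (d) can be checked numerically in the usual matrix model of a spin-one system. The three 3×3 matrices below are the standard spin-1 matrices in units where $h = 2\pi$; choosing them is an assumption of this sketch, since the text does not write them out. The squared projections along three orthogonal axes turn out to be commuting projection matrices (so each takes only the values 0 and 1) whose sum is twice the identity.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def close(a, b, tol=1e-12):
    """Entrywise comparison with a small tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


R = 1 / math.sqrt(2)
SX = [[0, R, 0], [R, 0, R], [0, R, 0]]                        # assumed model
SY = [[0, -1j * R, 0], [1j * R, 0, -1j * R], [0, 1j * R, 0]]
SZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

# Squared projections of the spin on the x, y and z axes.
SQUARES = [matmul(s, s) for s in (SX, SY, SZ)]
TOTAL = [[sum(sq[i][j] for sq in SQUARES) for j in range(3)]
         for i in range(3)]
TWO_I = [[2 if i == j else 0 for j in range(3)] for i in range(3)]

print(close(TOTAL, TWO_I))
```

Because the three squares commute, they can be measured together, which is what makes the measurement in (b) meaningful in this model.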
12.3. Attempt at a classical interpretation. This could consist in adopting the
following hypotheses A and B: 

A. There is a certain space $\Omega$ of “hidden variables” or “internal states”
of the system and a function $s(a, t; \omega)$, $\omega \in \Omega$, such that, if the system is in
the state $\omega$ at time $t$, then $s(a, t; \omega)$ is the “true value of the projection of
the spin on the $a$-axis” at this moment.

B. The probabilistic aspect of the predictions in 12.2(c) results from
our not knowing the exact values of $\omega = \omega(t)$, so that for some measure
$d\mu(\omega)$ we have

$$\text{mathematical expectation of } s(a, t) = \int_{\Omega} s(a, t; \omega)\, d\mu(\omega),$$

and similarly for $|s|$.

Generalizing, we might suppose that $\Omega$ does not only depend on the
system itself but also on the arrangement for measuring the spin; $\mu$ may
depend on the time, and so on. However, all of these possibilities actually
contradict the predictions in 12.2(c) for the following startling reason.


12.4. Proposition (Kochen, Specker). There does not exist a mapping $S^2 \to
\{0, 1\}$ such that for every frame $\{a_1, a_2, a_3\}$ this mapping takes the value
zero on precisely one of the directions $a_i$. Moreover, it is possible to
construct a finite system $\Gamma \subset S^2$ of 117 points with the following property.
For any mapping $k : \Gamma \to \{0, 1\}$ either there is a frame $\{a_1, a_2, a_3\} \subset \Gamma$ on
which k does not take the value 0 exactly once, or else there is a pair of
perpendicular directions $\{a_1, a_2\} \subset \Gamma$ on which k equals 0.


Here we note that adopting both the assertions in 12.2 and the hypotheses
in 12.3 would allow us to construct such a mapping of the sphere. In
fact, it would be sufficient to consider

$$S^2 \to \{0, 1\}: \quad a \mapsto |s|(a, t; \omega)$$

for fixed $t$ and $\omega$. By 12.2(c), $|s|$ only takes the values 0 and 1, and, by
12.2(d), it takes the value 1 twice and 0 once on any frame $\{a_1, a_2, a_3\}$.

We prove Proposition 12.4 in subsections 12.12—12.15, and now proceed 
to a more systematic study of “quantum logic.” We shall adhere to our
customary and useful dualism between “language and interpretation,” 
although these categories are much less formalized and are harder to 
distinguish from each other in physics. 


12.5. The language of nonrelativistic quantum mechanics. We have a some- 
what unusual situation in that quantum mechanics does not really have its 
own language. More precisely, to describe a physical system S such as a 
“free electron,” “atom of helium in a magnetic field,” etc., quantum 
mechanics uses a certain fragment of the language of functional analysis, 
“oriented on describing S.” Assuming that the reader is familiar with
functional analysis, we shall limit ourselves to a glossary of the most 
frequently used terms. We also give some synonyms used by physicists to 
indicate the “physical sense,” i.e., the interpretation, which will be consid- 
ered separately in our text. 

(a) A separable complex Hilbert space $\mathcal{H}_S$. Here we are also interested in
its one-dimensional subspaces and its vectors of length one. A synonym for
the former is the (pure) states, and for the latter is the (normalized)
$\psi$-functions, or, more precisely, the instantaneous values of the
$\psi$-functions.

(b) Unitary representations of $\mathbb{R}$ in $\mathcal{H}_S$: $t \mapsto U_t = e^{-iH_S t}$. For synonyms
we have: $t \mapsto U_t$ is the dynamic group; $t$ is the time; and the infinitesimal
generator $H_S$ (which is a self-adjoint operator) is the dynamic operator, or
Hamiltonian, of S.

(c) Schrödinger equation: $\partial\psi_t/\partial t = -iH_S\psi_t$. It is satisfied by the
$\psi$-functions $\psi_t = e^{-iH_S t}\psi_0$, which evolve with time.

(d) Self-adjoint operators in $\mathcal{H}_S$. Synonym: the observables of the
system. The operator $H_S$ is an energy observable. The discrete spectrum of
$H_S$ gives us the energy levels of S. We shall be especially interested in the
orthogonal projection observables. Here the pure states $\mathbb{C}\psi \subset \mathcal{H}_S$ are in
one-to-one correspondence with the projections $P_\psi$ onto the corresponding
subspace.

Another important class of projections is constructed using the spectral
decomposition theorem. Let $A = \int_{-\infty}^{\infty} \lambda\, dP_A(\lambda)$. Then the projection $P_A(U)$
is defined for any Borel subset $U \subset \mathbb{R}$. In the simplest cases its image is
spanned by the vectors in $\mathcal{H}_S$ which are eigenvectors for A with eigenvalues
in U.

Projection observables are also called “questions” (Mackey) or
“Eigenschaften” (von Neumann).

(e) Commuting operators. Synonym: compatible (or simultaneously
measurable) observables. For unbounded operators A and B, whose formal
commutator may have an empty domain of definition, we define
commutativity to mean that $P_A(U_1)$ and $P_B(U_2)$ commute for all Borel sets
$U_1, U_2 \subset \mathbb{R}$.

(f) Unitary representations in $\mathcal{H}_S$ of various groups, such as SO(3),
SU(2), $S_n$, etc. Synonym: symmetries of the system S (if the representations
commute with the Hamiltonian $H_S$), or approximate symmetries (if
$H_S = H_0 + H_1$, where the representations commute with $H_0$, and $H_1$ is a
“small perturbation”).


12.7. EXAMPLE. Let S be “an electron in the electric field of a proton” 
(where we disregard the motion of the proton, the spin, and the relativistic 
effects). Here: 

$\mathcal{H}_S = L^2(E^3)$ consists of the square integrable complex functions in the
Euclidean “physical coordinate space of the electron.”

$H_S$ is the self-adjoint extension of the operator

$$-\frac{h^2}{8\pi^2 m}\,\Delta - \frac{e^2}{r},$$

where $h$ is Planck’s constant, $m$ is the mass of the electron, $e$ is its charge,
and $r$ is its distance from the origin (where the proton is).

The energy levels (the discrete spectrum of $H_S$) are: $E_n =
-(2\pi^2 m e^4/h^2) \cdot (1/n^2)$, $n = 1, 2, 3, \ldots$. The eigenfunctions $\psi$
corresponding to the points of this spectrum are the states of an electron in a
hydrogen atom. The energy level $n = 1$ corresponds to the unexcited state,
and the other values of $n$ correspond to excited states. The positive
semiaxis is the continuous spectrum of $H_S$; in states with positive electron
energy, “the hydrogen atom is ionized.”

The most important observables of the electron are: the operators of
multiplication by the three coordinate functions $x_j$ (the coordinate
observables), and the self-adjoint extension of the operators
$p_j = (h/2\pi i)(\partial/\partial x_j)$ (the momentum projection observables). The
operators $x_j$ and $p_j$ do not commute, so that the $x_j$-coordinate and the
projection of the momentum on the $x_j$-axis are not simultaneously measurable.

The system S is spherically symmetric. The natural representation of
SO(3) in $L^2(E^3)$ commutes with $H_S$. The restriction of this representation
to the subspace of $\mathcal{H}_S$ corresponding to the discrete spectrum of $H_S$ in a
natural way splits into a direct sum of representations corresponding to a
given energy level $E_n$. This $E_n$-subspace, in turn, splits into a direct sum of
representations of SO(3) on spherical polynomials of degree $j =
0, 1, 2, \ldots, n - 1$ with multiplicity one. If the $\psi$-function of the electron
belongs to the level $E_n$, and to the subspace corresponding to the
representation of SO(3) on spherical polynomials of degree $j$, we say that $n$
and $j$ are the principal and orbital quantum numbers, respectively, of the
electron’s state in the hydrogen atom.


The above text is typical of what might be found in a physics textbook. 
The “language” is mixed with the “metalanguage” which gives the stan- 
dard interpretation of the language. We now describe them separately and 
more systematically. 


12.8. The interpretation. A very important aspect of the interpretation 
which we shall not discuss here is the list of informal recipes for choosing 
$\mathcal{H}_S$, $H_S$, and the observables corresponding to a given system S. These
“units of expression” are often chosen in two stages: a classical description 
is chosen, and then the “rules of quantization” are applied to it. This 
procedure might be “approximate” in the sense that certain circumstances 
are not taken into account (such as the spin in 12.7). 

Suppose that $\mathcal{H}_S$ and $H_S$ have already been chosen. The most characteristic
peculiarity of the interpretation of quantum language is that it is
“two-layered.” Part of the mathematical statements are interpreted as
assertions about a “freely evolving system,” and part are interpreted as
assertions about the results of observations on this system.

(a) Freely evolving system. It is generally believed that the system’s
$\psi$-function $\psi_t \in \mathcal{H}_S$ gives (within the framework of a given approximation)
maximally complete information about the state of the system at time $t$. As
long as no one looks in on the system, $\psi_t$ evolves as $e^{-iH_S t}\psi_0$, starting from
the initial state $\psi_0$. (How do we know $\psi_0$? See subsection 12.8(c) below.)

(b) Observation. Suppose we want to measure the instantaneous value of
some physical quantity for our system S at the moment $t$. This quantity
corresponds to an observable A. (How do we know the form of A? See the
beginning of 12.8.) For simplicity we suppose that A has a discrete
spectrum with all multiplicities one. The predictions of what will be
observed are as follows.

If $A\psi_t = a\psi_t$, then $a$ will be the value of the observable A at the time $t$
for the system S in the state with $\psi$-function $\psi_t$.

In the general case, let $\psi^{(i)}$, $i = 1, 2, \ldots$, be an orthonormal basis for
$\mathcal{H}_S$ consisting of eigenvectors for A. We expand $\psi_t$ with respect to this
basis: $\psi_t = \sum_{i=1}^{\infty} \alpha_i(t)\,\psi^{(i)}$. Let $A\psi^{(i)} = a_i\psi^{(i)}$. Then the result of
measuring A will be a random variable taking the value $a_i$ with probability
$|\alpha_i(t)|^2$. (It is easy to see that the mathematical expectation of this
random variable is $(A\psi_t, \psi_t)$. This formula holds for all A. More generally,
the probability of A falling in a Borel subset $U \subset \mathbb{R}$ is equal to
$(P_A(U)\psi_t, \psi_t)$, where $P_A(U)$ was defined in 12.5(d).)

(c) System evolving after observation. With the same assumptions as
before, the $\psi$-function of the system after the observation is determined by
the result of the observation. If we registered the value $a_i$ for A at the time
$t_0$, then, starting from $\psi^{(i)}$ at $t_0$, S evolves until the next observation
completely independently of how it evolved before.

Thus, the result of the observation lets us know the form of the
$\psi$-function after the observation, but it tells us nothing about the
$\psi$-function before the observation. Hence, physicists often say that
registering the value $a_i$ prepares the system in the state $\psi^{(i)}$ at the time
$t_0$. Another synonym: at the moment of observation the $\psi$-function of the
system reduces to $\psi^{(i)}$.

If we were able simultaneously to register the values of two observables,
then we would prepare the system with a $\psi$-function which is an
eigenfunction for both observables. Since noncommuting observables have no
common basis of eigenvectors, in general the values of such observables are
not simultaneously measurable.
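
The probabilistic rule of 12.8(b) can be illustrated on the smallest nontrivial example. The two-dimensional real space, the observable A that swaps the two coordinates, and all names below are assumptions of this sketch, not taken from the text. The eigenvalues of A are +1 and −1, and the sketch compares the expectation computed from the probabilities $|\alpha_i|^2$ with the inner product $(A\psi, \psi)$.

```python
import math

# Assumed toy observable: A(u, v) = (v, u), with eigenvalues +1 and -1.
R = 1 / math.sqrt(2)
EIGENVECTORS = {+1: (R, R), -1: (R, -R)}   # orthonormal eigenbasis of A


def inner(u, v):
    """Inner product of real vectors (no conjugation needed)."""
    return u[0] * v[0] + u[1] * v[1]


def probabilities(psi):
    """Probability |alpha_i|^2 of registering each eigenvalue a_i."""
    return {a: inner(psi, vec) ** 2 for a, vec in EIGENVECTORS.items()}


def expectation_from_probabilities(psi):
    return sum(a * p for a, p in probabilities(psi).items())


def expectation_from_inner_product(psi):
    a_psi = (psi[1], psi[0])               # A applied to psi
    return inner(a_psi, psi)


psi = (math.cos(0.3), math.sin(0.3))       # a normalized state
print(expectation_from_probabilities(psi), expectation_from_inner_product(psi))
```

Both numbers equal $\sin 0.6$ up to rounding, and the two probabilities add up to 1 because the state is normalized.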


12.9. Quantum logic. We now investigate the algebraic framework of 
quantum logic. We start with the following analogous situation. 

Suppose we are given a formal language in $L_1$ having one variable and
an interpretation of this language in a set M where this variable takes 
values. Then we can distinguish the Boolean algebra B of definable sets in 
M (see §3). The conjunction of formulas corresponds to the Boolean
intersection of the sets that define them, and so on. By definition, N € B if 
we can ask in the language, “Does the value of the variable belong to N?”. 
The algebra B is the most important invariant of the pair {language, 
interpretation}. 

We now consider the language of quantum mechanics, oriented on 
describing a system S. We shall exclude the time aspect by fixing a 
moment of time to which all statements about the state of the system refer. 
Then the “state of the system” will be the only variable in the language. It 
takes values in the set of lines in the Hilbert space $\mathcal{H}_S$. The only questions
to which we can give a yes or no answer are those of the form: “Does the
state of the system belong to a given closed subspace of $\mathcal{H}_S$?”. It is the
closed subspaces of $\mathcal{H}_S$ which form the analogy of the Boolean algebra B.
The conjunction of questions corresponds to the intersection of subspaces 
and the disjunction corresponds to their sum, but both operations can only 
be performed when the corresponding projection observables commute. 
Only in this case are the Boolean identities fulfilled. 

We axiomatize the situation as follows: 


12.10. Definition. A partial Boolean algebra is a set B together with the
following structures on B:


(a) A reflexive and symmetric binary relation ∗ called “compatible
    measurability.” Instead of $(a, b) \in {\ast}$ we write $a \ast b$.

(b) Partial binary operations $\vee$ and $\wedge$ and a unary operation $'$.

(c) Two elements 0 and $1 \in B$.


These structures must satisfy the following axioms:


(d) The relation ∗ is closed with respect to the operations $\wedge$, $\vee$, and $'$:
    if $a_1$, $a_2$ and $a_3$ are pairwise compatibly measurable, then
    $(a_1 \wedge a_2) \ast a_3$, $(a_1 \vee a_2) \ast a_3$, and $a_1' \ast a_3$; in addition,
    $a \ast 0$ and $a \ast 1$ for all $a \in B$.

(e) If $a_1$, $a_2$ and $a_3$ are pairwise compatibly measurable, then together
    with 0 and 1 they generate a Boolean algebra relative to the
    operations $\vee$, $\wedge$, and $'$.


12.11. Example. Let $\mathcal{H}$ be a Hilbert space (possibly real and finite
dimensional). The partial Boolean algebra $B(\mathcal{H})$ is defined as the set of
closed subspaces of $\mathcal{H}$ with the following structures:


(a) $a \ast b$ if and only if there exist three pairwise orthogonal closed
    subspaces $c, d, e \subset \mathcal{H}$ such that $a = c \oplus d$ and $b = e \oplus d$. The
    motivation for this definition is that this condition is equivalent to
    commutativity of the projections onto a and b.

(b) $a \wedge b$ = the intersection of a and b.

(c) $a \vee b$ = the sum of a and b.

(d) $a'$ = the orthogonal complement of a.

(e) $0 = \{0\}$ and $1 = \mathcal{H}$.
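
The equivalence mentioned in (a) can be seen on a few subspaces of three-dimensional real space. The sketch below is only an illustration with hypothetical helper names: it builds the orthogonal projection onto a line and tests whether two projections commute. Two orthogonal lines, or a line and a plane containing it, are compatible in the sense of (a); two distinct non-orthogonal lines are not.

```python
def projector(u):
    """Orthogonal projection onto the line spanned by the nonzero vector u."""
    norm2 = sum(t * t for t in u)
    return [[u[i] * u[j] / norm2 for j in range(3)] for i in range(3)]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def commute(p, q, tol=1e-12):
    """True when the matrices p and q commute up to rounding."""
    pq, qp = matmul(p, q), matmul(q, p)
    return all(abs(pq[i][j] - qp[i][j]) < tol
               for i in range(3) for j in range(3))


PLANE_XY = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # projection onto the plane z = 0

print(commute(projector((1, 0, 0)), projector((0, 1, 0))))   # orthogonal lines
print(commute(projector((1, 0, 0)), projector((1, 1, 0))))   # lines at 45 degrees
print(commute(PLANE_XY, projector((1, 1, 0))))               # line inside a plane
```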


One form for the theorem that there are no hidden variables is as 
follows. 


12.12. Theorem. If $\dim \mathcal{H} \geq 3$, then $B(\mathcal{H})$ cannot be imbedded in a Boolean
algebra in such a way that the operations are preserved.


This result can be strengthened formally in various ways: see §5 of 
Kochen and Specker, and also N. Zierler, M. Schlesinger, Duke Math. J.,
vol. 32, No. 2 (1965), 251-262. We shall not dwell on this here. 


Proof. We choose a real Euclidean space $E^3 \subset \mathcal{H}$ and show that even
$B(E^3)$ cannot be imbedded in a Boolean algebra. Otherwise there would
exist a homomorphism of the partial Boolean algebra $B(E^3)$ onto the
two-element Boolean algebra {0, 1}, since, for any pair of elements in any
Boolean algebra, there exists a homomorphism onto {0, 1} which separates
them.

Let $h$ be such a homomorphism. If $a_1, a_2, a_3 \subset E^3$ are pairwise orthogonal
lines, then $h(a_i \wedge a_j) = h(a_i) \wedge h(a_j) = 0$ for $i \neq j$. Hence, in any pair of
orthogonal lines, at least one of the pair must go to 0 under $h$. Furthermore,
$h(a_1 \vee a_2 \vee a_3) = h(a_1) \vee h(a_2) \vee h(a_3) = h(E^3) = 1$. Hence, in any
frame exactly one of the lines goes to 1.

If we map the points of the unit sphere $S^2$ onto the lines joining them to
the origin and then apply $h$, we obtain a mapping of $S^2$ with the property
in Proposition 12.4 (where we only have to switch the roles of 0 and 1).
We prove that no such map exists even on a certain subset consisting of
117 points on $S^2$. The latter stronger result is combinatorially elegant and
physically meaningful: a physicist might raise objections to asking to be
able to measure the projection of the spin of orthohelium simultaneously in
all directions, independently of the question of whether or not hidden
variables are possible. In fact, we only need finitely many directions to
show the futility of such an attempted measurement.

Consider a finite graph. By a realization of the graph on $S^2$ we mean
any imbedding of the set of its vertices in $S^2$ for which the distance
between the endpoints of any edge equals 90°.


12.13. Lemma. Let $\alpha$ and $\beta$ be points on $S^2$ such that the sine of the angle
between them lies in $[0, \tfrac{1}{3}]$. Then there exists a realization of the following
graph $\Gamma_1$ in which $a_0$ goes to $\alpha$ and $a_9$ goes to $\beta$.

Proof. Let $x, y, z$ be an orthonormal frame on $S^2$. We take $a_5$ to $x$ and
$a_6$ to $z$. For certain $\xi, \eta \in \mathbb{R}$ (to be chosen later), we set

$$a_1 \mapsto \frac{y + \xi z}{\sqrt{1 + \xi^2}}, \qquad
a_2 \mapsto \frac{x + \eta y}{\sqrt{1 + \eta^2}}.$$

Then the images of $a_3$ and $a_4$ are determined up to a sign by the property
of being orthogonal to $(a_1, a_5)$ and $(a_2, a_6)$, and we choose

$$a_3 \mapsto \frac{\xi y - z}{\sqrt{1 + \xi^2}}, \qquad
a_4 \mapsto \frac{\eta x - y}{\sqrt{1 + \eta^2}}.$$

We similarly set

$$a_0 \mapsto \frac{\xi\eta x - \xi y + z}{\sqrt{1 + \xi^2 + \xi^2\eta^2}}, \qquad
a_7 \mapsto \frac{x + \eta y + \xi\eta z}{\sqrt{1 + \eta^2 + \xi^2\eta^2}},$$

and, finally, $a_8$ and $a_9$ are determined up to a sign. The sine of the angle
between $a_0$ and $a_9$ is easy to compute: it equals

$$\frac{\xi\eta}{\sqrt{(1 + \xi^2 + \xi^2\eta^2)(1 + \eta^2 + \xi^2\eta^2)}}.$$

As $\xi$ and $\eta$ vary, this expression takes on all values in $[0, \tfrac{1}{3}]$. □
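
The construction in this proof can be checked numerically. The sketch below places ten unit vectors as in the proof of Lemma 12.13, with $x, y, z$ taken to be the coordinate axes. Two points are assumptions of the sketch rather than statements of the text: the assignment of vertex labels to vectors, and the list `ORTHOGONAL_PAIRS`, which records the orthogonality relations that this construction satisfies (the figure showing the edges of the graph is not reproduced, so the list is not claimed to be exactly its edge set).

```python
import math


def unit(v):
    """Normalize a nonzero vector of length three."""
    n = math.sqrt(sum(t * t for t in v))
    return tuple(t / n for t in v)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def realization(xi, eta):
    """Images of a_0, ..., a_9 for given parameters (labels are assumed)."""
    a = {5: (1.0, 0.0, 0.0), 6: (0.0, 0.0, 1.0)}
    a[1] = unit((0.0, 1.0, xi))
    a[2] = unit((1.0, eta, 0.0))
    a[3] = unit((0.0, xi, -1.0))
    a[4] = unit((eta, -1.0, 0.0))
    a[0] = unit((xi * eta, -xi, 1.0))
    a[7] = unit((1.0, eta, xi * eta))
    a[8] = unit(cross(a[0], a[7]))     # orthogonal to a_0 and a_7
    a[9] = unit(cross(a[7], a[8]))     # orthogonal to a_7 and a_8
    return a


ORTHOGONAL_PAIRS = [(5, 6), (1, 5), (2, 6), (1, 3), (3, 5), (2, 4), (4, 6),
                    (0, 1), (0, 2), (3, 7), (4, 7), (0, 8), (7, 8), (7, 9),
                    (8, 9)]


def sine_a0_a9(xi, eta):
    """Sine of the angle between the images of a_0 and a_9."""
    a = realization(xi, eta)
    return math.sqrt(sum(t * t for t in cross(a[0], a[9])))


def formula(xi, eta):
    """The closed expression stated at the end of the proof."""
    s = xi * xi * eta * eta
    return xi * eta / math.sqrt((1 + xi * xi + s) * (1 + eta * eta + s))


print(sine_a0_a9(1.0, 1.0), formula(1.0, 1.0))   # both close to 1/3
```

The value 1/3 is attained at $\xi = \eta = 1$, which is the maximum of the expression for positive parameters.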


12.14. Lemma. Consider the graph $\Gamma_2$ which is obtained from Figure 1 by
identifying the vertices $a = p_0$, $b = q_0$, and $c = r_0$ (the apparent
intersections of the edges inside the circle are not vertices). This graph is
realized on $S^2$.


Proof. For $0 \leq k \leq 4$ set

$$p_k = \cos\frac{\pi k}{10}\cdot x + \sin\frac{\pi k}{10}\cdot y, \qquad
q_k = \cos\frac{\pi k}{10}\cdot y + \sin\frac{\pi k}{10}\cdot z, \qquad
r_k = \sin\frac{\pi k}{10}\cdot x + \cos\frac{\pi k}{10}\cdot z.$$

Since $\sin(\pi/10) < \tfrac{1}{3}$, we can first extend this map to a realization of the
subgraph between the points $p_0$, $p_1$ and $r_0$ by using the preceding lemma.
Rotating the resulting realization around $r_0$ so as to take $(p_0, p_1)$ to
$(p_1, p_2), (p_2, p_3), \ldots$, we obtain a realization of the “lower arc” and $r_0$. By
similarly rotating around the images of $p_0$ and $q_0$, we obtain a realization
of the other two arcs as well. □
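
The numerical facts used in this proof are quickly confirmed. The sketch below takes $x, y, z$ to be the coordinate axes (an assumption made for concreteness) and checks that consecutive points on an arc are 18° apart, that $\sin(\pi/10) < 1/3$ so that Lemma 12.13 applies, and that the three arcs close up, since the formula for $k = 5$ returns the starting point of the next arc.

```python
import math


def p(k):
    t = math.pi * k / 10
    return (math.cos(t), math.sin(t), 0.0)


def q(k):
    t = math.pi * k / 10
    return (0.0, math.cos(t), math.sin(t))


def r(k):
    t = math.pi * k / 10
    return (math.sin(t), 0.0, math.cos(t))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def sine_between(u, v):
    """Sine of the angle between two unit vectors."""
    return math.sqrt(max(0.0, 1.0 - dot(u, v) ** 2))


print(sine_between(p(0), p(1)), math.sin(math.pi / 10) < 1 / 3)
```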
Figure 1.


12.15. END OF THE PROOF OF PROPOSITION 12.4 AND THEOREM 12.12.
Consider an arbitrary map $k$ of the vertices of the graph $\Gamma_2$ to {0, 1}.
Suppose that exactly one vertex in each triangle goes to 1 and at least one
of the two vertices on each edge goes to 0. In the triangle $\{p_0, r_0, q_0\}$
suppose that $p_0$ goes to 1. We consider the copy of the graph $\Gamma_1$ between
the vertices $p_0$, $r_0$, and $p_1$, which we identify with $a_0$, $a_8$, and $a_9$,
respectively.

We must have $k(p_1) = k(a_9) = 1$. In fact, if we had $k(a_9) = 0$, then we
would also have $k(a_7) = 1$, and then $k(a_1) = k(a_2) = k(a_3) = k(a_4) = 0$, and
$k(a_5) = k(a_6) = 1$, which is a contradiction.

We now return to $\Gamma_2$. Since $k(p_0) = k(p_1) = 1$ we similarly find that
$k(p_2) = 1$, and then $k(p_3) = k(p_4) = k(q_0) = 1$. But $k(q_0) = 1$ contradicts
the fact that $k(p_0) = 1$. This completes the proof. □


12.16. Quantum tautologies. This theme has been largely neglected. We give
a counterexample due to Kochen and Specker and formulate some recent
results of Gelfand and Ponomarev.

(a) Counterexample. This consists of the following: it is possible to give a
logical polynomial in 117 variables which represents a classical tautology
but which is defined and takes the value 0 in the partial Boolean algebra
B(E^3) for some values of the variables. This is simply another aspect of
the impossibility of imbedding B(E^3) in a Boolean algebra.

In fact, let P(p, q, r) be a logical polynomial in three variables which
takes the truth value 1 precisely when exactly one of |p|, |q|, and |r| is 1.
We may assume that only the connectives ∨, ∧, and ¬ occur in P. Similarly,
let Q(p, q) = ¬p ∨ ¬q. Then Q takes the value 1 precisely when at least one
of |p|, |q| is 0. We index the vertices of Γ_2 from 1 to 117 and set


R(p_1, ..., p_117) = ¬[ (⋀ P(p_i, p_j, p_k)) ∧ (⋀ Q(p_r, p_s)) ].


The first ⋀ is taken over all triples {i, j, k} corresponding to triangles in
Γ_2, and the second ⋀ is taken over all pairs {r, s} corresponding to edges.
The argument in 12.15 shows that for any mapping {p_1, ..., p_117} →
{0, 1} at least one of the Boolean factors takes the value 0. Hence R is a
classical tautology.

But if we substitute for p_i the line from the origin to the image of the ith
vertex in a fixed realization of Γ_2, then we obtain for the value of R the
element 0 ∈ B(E^3). In fact, if p_r and p_s are orthogonal, then
Q(p_r, p_s) = ¬p_r ∨ ¬p_s = E^3 = 1 ∈ B(E^3). Similarly, if p_i, p_j, and
p_k are orthogonal, then P(p_i, p_j, p_k) = 1 ∈ B(E^3).
The latter assertion is verified as follows: if we set 


a + b = (a ∧ ¬b) ∨ (¬a ∧ b),
then we may take


P(p, q, r) = p + q + r + p ∧ q ∧ r
(for any arrangement of parentheses on the right), so that


P(p_i, p_j, p_k) = p_i ⊕ p_j ⊕ p_k = E^3.
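
The classical (two-valued) part of this claim can be checked by a truth table. The following is only a sketch: the function names plus, P, and Q are chosen here for illustration, and the code tests the Boolean case, not the evaluation in B(E^3).

```python
from itertools import product

def plus(a, b):
    """a + b = (a AND NOT b) OR (NOT a AND b), the sum used in the text."""
    return (a and not b) or (not a and b)

def P(p, q, r):
    """P(p, q, r) = p + q + r + (p AND q AND r)."""
    return plus(plus(plus(p, q), r), p and q and r)

def Q(p, q):
    """Q(p, q) = NOT p OR NOT q."""
    return (not p) or (not q)

BOOLS = (False, True)

# P should be true precisely when exactly one argument is true.
P_IS_EXACTLY_ONE = all(
    P(p, q, r) == ((p, q, r).count(True) == 1)
    for p, q, r in product(BOOLS, repeat=3)
)

# Q should be true precisely when at least one argument is false.
Q_IS_NOT_BOTH = all(
    Q(p, q) == (not (p and q)) for p, q in product(BOOLS, repeat=2)
)

print(P_IS_EXACTLY_ONE, Q_IS_NOT_BOTH)
```

Both flags come out true, so this P and Q are admissible choices for the polynomials used in R.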


(b) Results of Gelfand and Ponomarev. We start with the following 
observation. The operations ∧, ∨, and ′ are actually defined everywhere
on the set B(ℋ) of closed subspaces of the Hilbert space ℋ, although they
do not satisfy the Boolean axioms, and, if we ignore the compatible 
measurability relation *, it seems as if they no longer have physical 
meaning. 

Nevertheless, it is also natural to investigate these structures, which 
were first introduced into the logic of quantum mechanics by G. Birkhoff 
and J. von Neumann (Annals of Math. vol. 37 (1936), 823-843). Here is 
how these structures are axiomatized: 


Definition. A modular structure L is a set with binary operations ∧ and ∨
which satisfy the following conditions:


(a) ∧ and ∨ are associative and commutative;
(b) a ∧ a = a ∨ a = a for all a ∈ L;
(c) If a ∧ b = a, then (a ∨ c) ∧ b = a ∨ (c ∧ b) (the “modular
identity”).


Birkhoff and von Neumann also require an “orthogonal complement” 
operation to exist with the usual axioms, but we shall omit this here. 

We note that the modular identity is only fulfilled universally in B(ℋ)
if ℋ is finite dimensional. It is also fulfilled for triples a, b, c whose
elements have finite dimension or codimension in ℋ.
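
For a very small finite dimensional example the modular identity can be verified by brute force. The sketch below is an illustration under assumptions made here, not taken from the text: the space is the three-dimensional space over the two-element field, vectors are encoded as 3-bit integers with XOR as addition, ∧ is intersection, and ∨ is the linear span of the union.

```python
from itertools import combinations

VECTORS = range(8)  # vectors of GF(2)^3 as 3-bit integers; addition is XOR

def span(gens):
    """Return the subspace generated by gens, as a frozenset of vectors."""
    s = {0}
    for g in gens:
        s |= {v ^ g for v in s}
    return frozenset(s)

# Every subspace of a 3-dimensional space is spanned by at most 3 vectors.
SUBSPACES = {span(c) for k in range(4) for c in combinations(VECTORS, k)}

def meet(a, b):
    return a & b

def join(a, b):
    return span(a | b)

def modular_identity_holds():
    """Check: if a ∧ b = a, then (a ∨ c) ∧ b = a ∨ (c ∧ b)."""
    for a in SUBSPACES:
        for b in SUBSPACES:
            if meet(a, b) != a:
                continue
            for c in SUBSPACES:
                if meet(join(a, c), b) != join(a, meet(c, b)):
                    return False
    return True

def distributive_law_holds():
    """Check a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c) for all triples."""
    return all(
        meet(a, join(b, c)) == join(meet(a, b), meet(a, c))
        for a in SUBSPACES for b in SUBSPACES for c in SUBSPACES
    )

print(len(SUBSPACES), modular_identity_holds(), distributive_law_holds())
```

The modular identity holds for all 16 subspaces, while the distributive law fails (three distinct lines in a plane give a counterexample), which is the sense in which these structures are not Boolean.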

I. M. Gelfand and V. A. Ponomarev (Uspehi mat. nauk, vol. XXIX
(1974), No. 6 (180), 3-58) have studied the linear representations of free
modular structures with r generators in B(ℋ) for finite dimensional spaces
over arbitrary fields. Such a representation is called indecomposable if it
does not split into a direct sum of representations in B(ℋ_1) ⊕ B(ℋ_2).


Definition. A modular question is an element of a free modular structure
which takes the value 0 or 1 for any indecomposable finite dimensional
representation.


One of the main results of Gelfand and Ponomarev is the construction 
of a very nontrivial countable series of modular questions. We shall only 
formulate these results here. 

Let L^n be a free modular structure with n generators {a_1, ..., a_n}. We
set I = {1, ..., n}. A sequence α = (i_1, ..., i_l) of length l ≥ 1 of
elements of I is called admissible if it does not have any identical
neighboring entries. A sequence β = (k_1, ..., k_{l−1}) of length l − 1 of
elements of I is called subordinate to α if it is admissible and if
k_j ∉ {i_j, i_{j+1}} for all j ≤ l − 1. For admissible α we inductively
define


a_α = a_{i_1 ... i_l} = a_{i_1} ∧ (⋁_β a_β),


where β runs through all sequences subordinate to α. Further, for
t ∈ {1, ..., n} we define


h_t(l) = ⋁_α a_α,


where α runs through all admissible sequences of length l with last entry t.
Finally, we set


H_t(l) = ⋁_{i≠t} h_i(l).


The substructure in L^n generated by the elements H_1(l), ..., H_n(l)
consists entirely of modular questions for all l ≥ 1.

This is a difficult result. It is relatively easy to prove that this
substructure is a Boolean algebra consisting of 2^n elements. If we
substitute the elements in this Boolean algebra for the variables in the
usual Boolean tautologies, we obtain “quantum tautologies,” but to see this
we must consider structures with complements.

It is not yet clear whether this algebra leads to nontrivial physics.
Perhaps one should combine it with the techniques in the representation
theory of symmetry groups.


12.17. The orthohelium atom revisited. In conclusion, we return to the 
orthohelium atom S and show how the material in 12.2 looks from a more 
general vantage point. 

(a) Choice of ℋ_S. As explained in 12.7, an electron without spin
corresponds to the space L^2(E^3). If we want to take the spin into account,
we must introduce a “two-component” ψ-function, i.e., use the space
L^2(E^3) ⊗ C^2. The system of two electrons in helium is described by
ψ-functions in the tensor square of this space. However, by Pauli’s
principle, the ψ-function of this system must behave antisymmetrically when
the electrons corresponding to the two parts of the tensor square are
permuted. Hence, we finally obtain ℋ_S = Λ^2(L^2(E^3) ⊗ C^2).

(b) Choice of H_S. This is a difficult problem, because each electron
moves in the variable electromagnetic field created by the nucleus and the
other electron. The principal term in the Hamiltonian corresponds to the
spherically symmetric constant potential obtained by averaging over time.
The remainder is treated as a small perturbation. We give the approximate
form of the ψ-function of orthohelium, more precisely, of the element in
Λ^2(L^2(E^3)) corresponding to the projection of ℋ_S onto the subspace of
the unit projection of the spin:


pre Kt (C, + C3 (7, +72) + Caria + Cory (7) + 72)sinh Co (7, — 72) 
+ (ry = r2)(C3 + Cor2)cosh Co (7) — r))| 


where 7, =(3}_,x;)'/7, 6 = 1, 25 ry = (ja (%yj — X))'”, and the con- 
stants k, C\,...,(C, are found experimentally. (E. U. Condon and G. H. 
Shortley, The Theory of Atomic Spectra, Cambridge University Press, 
London, 1935.) 

(c) Approximate symmetries. The group SU(2) acts on the space ℋ_S: on
L^2(E^3) through the quotient group SO(3), and on C^2 by the standard
representation. This is the group of approximate symmetries of the system.
The ψ-function of orthohelium is “not too far” from the subspace
corresponding to a suitable representation of SU(2), so we may speak of the
principal (n), orbital (l), and other quantum numbers of the state, as in the
case of a hydrogen atom.

(d) Spin. The total angular momentum operator $ commutes with the 
Hamiltonian H,. In the state n = 2 and j = 1, its eigenvalue is 2 (in atomic 
units). The eigensubspace N c SC, corresponding to this eigenvalue is 
three-dimensional. Further, the squared spin projection operators 4, 

3, {2 commute in pairs (this is a peculiarity of spin 1). Letting P denote 
the projection of ℋ_S onto N, we are then able to imbed the partial
Boolean algebra B(E^3) in B(ℋ_S) by letting a line a ⊂ E^3 correspond to
the image in JC, of the operator P42. This takes the place of the somewhat 
naive picture in 12.2. 


Appendix: The von Neumann universe 


1. The premises of “naive” Cantorian set theory reduce to the following: a 
set may consist of any distinguishable elements (of the physical or intellec- 
tual world); a set is uniquely determined by its elements, and any property 
determines a set, namely, the set of objects which have this property. 

However, the formal language of set theory L_1Set was introduced in
order to describe a more restricted class of sets (a universe). Part of these 
restrictions come from considerations of convenience, and part come from 
the desire to avoid the so-called paradoxes. This gives an “upper bound” 
for our classes. We give a “lower bound” by asking that the class of sets be 
closed with respect to all mathematical constructions needed for certain 
(ideally, “all’’) parts of intuitive mathematics. 


2. Following Zermelo, von Neumann, and others, we consider two basic 
restrictions on sets. 

(a) All elements of sets must themselves be sets. In particular, since any
chain X_0 ∋ X_1 ∋ X_2 ∋ ··· in the von Neumann universe V must
terminate (see below), it follows that the last element in such a chain must
be the empty set. Thus, all the sets in V are constructed “from nothing.”

(b) The assumption that every collection of sets, even sets as in (a), is
again a set in V, immediately leads to contradictions (Burali-Forti,
Russell, and others). In particular, the collection of all sets in the universe
is not itself an element of V. Hence, we must give a sufficiently complete
description of which operations do not take us outside of V. The two basic
formal languages of set theory, that of Gödel-Bernays and that of
Zermelo-Fraenkel, differ in the choice of objects over which the variable
symbols are to range under the standard interpretation of the language in
V. In the Zermelo-Fraenkel language (our L_1Set), they range over the sets
in V. In the Gödel-Bernays language, they name classes (collections of sets
in V) which “are not necessarily sets,” and the property of “being a set” is
specially defined as the property of “being an element of another class.”
The Gödel-Bernays language is studied in Chapter 4 of Mendelson’s book.

In this section we describe the von Neumann universe using the 
customary terminology of intuitive mathematics. The relationship of this 
construction to formalism will be discussed in subsection 18. 


3. The first levels. The von Neumann universe is constructed inductively,
starting from the empty set, by successively applying the “set of all
subsets” or “power set” operation 𝒫. In this way:

V_0 = ∅,
V_1 = 𝒫(∅) = {∅},
V_2 = 𝒫(V_1) = {∅, {∅}},
...
V_{n+1} = 𝒫(V_n),
...


It is easy to see that V_n ⊂ V_{n+1} (later this will be proved in complete
generality). The level V_n consists of


2^(2^(···^2)) (n − 1 twos)


finite sets, whose elements are also finite sets, and so on.
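
The count of elements in the first levels can be confirmed directly. The following sketch models hereditarily finite sets by Python frozensets; the helper names power_set, levels, and tower are introduced here only for the illustration.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s, as a frozenset of frozensets."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for k in range(len(elems) + 1)
        for c in combinations(elems, k)
    )

def levels(n):
    """Return the list [V_0, V_1, ..., V_n]."""
    out = [frozenset()]
    for _ in range(n):
        out.append(power_set(out[-1]))
    return out

def tower(k):
    """Return 2^(2^(...^2)) with k twos; the empty tower is 1."""
    t = 1
    for _ in range(k):
        t = 2 ** t
    return t

V = levels(5)
print([len(v) for v in V])
```

For n = 1, ..., 5 the sizes agree with a tower of n − 1 twos, and each computed level is contained in the next. The level V_6 already has 2^65536 elements, so the enumeration stops here.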
We cannot go beyond finite sets unless we regard all the V_n as “already
constructed” and apply 𝒫 to the union of the V_n. We set


V_{ω_0} = ⋃_{n=0}^{∞} V_n,
V_{ω_0+1} = 𝒫(V_{ω_0}),
...


The indices which we now use for the levels are the names of the first 
infinite ordinals. This remarkable idea of transfinite iteration of such 
constructions is due to Cantor, who first applied it to study trigonometric 
series, and then investigated it systematically, finding in it the key to the 
infinite. 

In the next two subsections our sets will temporarily be Cantorian sets. 
We shall return to V after developing some properties of ordinals. 


4. Ordinals. Let X be any set on which we are given a binary relation <. 
We consider the following properties of this relation: 


(a) Y ≮ Y for all Y ∈ X; if Y_1 < Y_2 and Y_2 < Y_3, then Y_1 < Y_3.
(b) For any Y, Z ∈ X, either Y < Z or Z < Y, or else Y = Z.
(c) Every nonempty subset of X has a least element (in the sense of <). 


The relation < is a partial ordering of X if it satisfies (a), a linear 
ordering of X if it satisfies (a) and (b), and a well-ordering of X if it satisfies 
all three conditions (a), (b), and (c). 

Let (X, <) be a well-ordering. The initial segment Ŷ determined by an
element Y ∈ X is the well-ordered set (Z, <), where Z = {Y′ | Y′ < Y}. As
is customary when speaking about a well-ordered set, we shall omit the
explicit indication of the ordering if it is clear from the context.


5. Lemma. Let X and Y be two well-ordered sets. Then exactly one of the
following alternatives holds:


(a) X and Y are isomorphic. 
(b) X is isomorphic to an initial segment in Y. 
(c) Y is isomorphic to an initial segment in X. 


In each case the isomorphism is uniquely determined. 


PROOF. We divide the argument into several steps.

(a) Let X be well-ordered, and let f: X → X be a monotonic map, i.e.,
Z_1 < Z_2 ⇒ f(Z_1) < f(Z_2). Then for all Z ∈ X we have f(Z) ≥ Z. In fact,
among the elements not having this property there would have to be a least
element Z_0. But f(Z_0) < Z_0 and the monotonicity of f imply that
f(f(Z_0)) < f(Z_0), so that we would have an even smaller element in the
set of elements not having the desired property.

(b) Therefore X is not isomorphic to any of its initial segments X̂_1: if
f: X → X̂_1 were an isomorphism, then we would have f(X_1) < X_1,
contradicting (a).

(c) Now let X and Y be well-ordered. We set f = {⟨X_1, Y_1⟩ | X_1 ∈ X,
Y_1 ∈ Y, and there exists an isomorphism of X̂_1 with Ŷ_1}. First of all, f is
the graph of a one-to-one mapping of pr_1 f onto pr_2 f. In fact, if
X_1 ≠ X_2, say X_1 < X_2, then by (b) X̂_1 is not isomorphic to X̂_2; by
symmetry, the same holds for f^{-1}. It is also clear from this that f and
f^{-1} are monotonic. Further, if X_1 ∈ pr_1 f and X_2 < X_1, then
X_2 ∈ pr_1 f, and similarly for pr_2 f. Finally, we show that either
pr_1 f = X, or else pr_2 f = Y. Otherwise, there would exist a minimal
element X_1 in X \ pr_1 f and a minimal element Y_1 in Y \ pr_2 f. But, by
the preceding paragraph, f induces an isomorphism of X̂_1 with Ŷ_1. By the
definition of f, we then have ⟨X_1, Y_1⟩ ∈ f, a contradiction.

(d) All of this means that either f is an isomorphism (more precisely, the
graph of an isomorphism) of the set X onto Y or an initial segment in Y, or
else f^{-1} is an isomorphism of Y onto X or an initial segment of X. It is
clear from the definition of f that the graph of any other isomorphism must
be contained in the graph of f, so we have uniqueness. The lemma is
proved. □


As a preliminary definition, we can now consider the class of all
well-ordered sets isomorphic to some fixed totally ordered set X, and call
that class an ordinal. Two ordinals α and β satisfy the relation α = β,
α < β, or α > β depending on which of the alternatives in Lemma 5 holds
for representatives X ∈ α and Y ∈ β (this obviously does not depend on
the choice of representatives).

The next step is, naturally, to consider “all” ordinals as a class and show
that < induces a well-ordering on this class, thereby giving a universal
well-ordering. However, an unnecessary difficulty arises here: the class of
well-ordered sets isomorphic to a fixed X is extremely large, and so the
class of ordinals must be a “class of classes,” which needlessly complicates
matters. An elegant technical discovery, due to von Neumann, removes
this difficulty: instead of a vast number of possible orderings imposed on
X from outside, we consider a single relation given by internal properties.
Recall that a set X is transitive if Z ∈ X whenever Z ∈ Y ∈ X for some Y.


6. Definition. An ordinal is a transitive set X of sets which is well-ordered
by the relation ∈ between its elements.


7. Theorem.
(a) The class of ordinals On is well-ordered by the relation α ∈ β
(which we shall also write α < β).
(b) Any well-ordered set is isomorphic to a unique ordinal α, and also to
a unique initial segment of ordinals (those less than α ∪ {α}).


PROOF. 

(a) We must verify conditions (a), (b), and (c) of subsection 4. The first
of them follows immediately from the definition.

To prove the second condition, we consider two ordinals α and β. By
Lemma 5, there exists an isomorphism f of one of them, say α, onto either
β or an initial segment of β. We show that then α = β or α ∈ β. To do this,
we prove that f(γ) = γ for all γ ∈ α. In fact, if γ_1 is the minimal element
with f(γ_1) ≠ γ_1, then f(γ_2) = γ_2 for all γ_2 ∈ γ_1. Since f is an
isomorphic imbedding of α with respect to the ordering ∈, and since γ_1 and
f(γ_1) are sets, we have f(γ_1) = {f(γ_2) | γ_2 ∈ γ_1} = {γ_2 | γ_2 ∈ γ_1}
= γ_1, which contradicts the choice of γ_1. The same argument shows that
f(α) = α, from which the condition follows.

Finally, let C be a nonempty class of ordinals, and let α ∈ C. If α is not
the least element in C, then the least element in the intersection α ∩ C will
be the least element in C.

(b) Let X be a well-ordered set. Let S denote the set of ordinals which
are isomorphic to some initial segment in X. S is nonempty, since, for
example, the ordinal {∅} is isomorphic to the segment consisting of the
least element of X. It is easy to see that the set β = ⋃_{α∈S} α is an
ordinal. We claim β is isomorphic to X. In fact, if this were not the case,
then β would be isomorphic to an initial segment in X, say X̂_1. But then
the ordinal β ∪ {β}, which is larger than β, would be isomorphic to the
initial segment X̂_1 ∪ {X_1}, contradicting the definition of β. □


We now give the elementary properties of ordinals. 


8. (a) The finite ordinals are the “natural numbers” (and zero) in the first
levels of the universe V. Thus, we shall write:


0 = ∅, 1 = {∅}, 2 = {∅, {∅}}, 3 = {∅, {∅}, {∅, {∅}}}, ...

(b) The ordinal which immediately follows a given α is α ∪ {α}. It is
also denoted α + 1, which agrees with the notation in (a) in the case of
finite α.

(c) An ordinal β is called a limit ordinal if β ≠ ∅ and β ≠ α + 1 for any
α. The first limit ordinal ω_0 is isomorphic as a totally ordered set to
{0, 1, 2, 3, ..., n, ...}. If α is a limit ordinal, then α = ⋃_{β<α} β. The
converse is also true.
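
The finite ordinals of (a) and the successor operation of (b) are easy to model. In the sketch below, which is only an illustration, sets are Python frozensets and the names successor, ordinal, and is_transitive are introduced here for the example.

```python
def successor(a):
    """Return a ∪ {a}, the ordinal immediately following a."""
    return a | frozenset([a])

def ordinal(n):
    """Return the finite von Neumann ordinal n, built from the empty set."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a

def is_transitive(x):
    """Check that Z ∈ X whenever Z ∈ Y ∈ X."""
    return all(z in x for y in x for z in y)

three = ordinal(3)
print(len(three), is_transitive(three), ordinal(2) in three)
```

Here the ordinal 3 has exactly three elements, namely 0, 1, and 2, it is transitive, and the order relation m < n is simply membership of ordinal(m) in ordinal(n).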

Ordinals are mainly used for three purposes: proofs using (transfinite) 
induction, constructions using (transfinite) recursion, and measuring cardi- 
nalities. Here are the basic principles. 


9. Transfinite induction. Let C be a class of ordinals for which


(a) ∅ ∈ C.
(b) If α ∈ C, then α + 1 ∈ C.
(c) If a set of ordinals {α_i} is contained as a subset in C, then
⋃ α_i ∈ C.


Then C contains all ordinals.


In fact, otherwise there would exist a least ordinal not in C, but this
could not be the empty set by (a), a limit ordinal by (c), or any other
ordinal by (b). In concrete applications, the verifications of (a) and (c)
are often trivial and are omitted.


10. Transfinite recursion. Let G be a function of sets (it will actually be 
sufficient to assume that G is defined on all sets in the universe) whose 
values are sets. Then there exists a unique function F on the ordinals such 
that 


F(α) = G(the set of values of F on the elements of α).


In fact, this equality uniquely determines F(0) = G(∅), and then F(1)
= G({F(0)}), F(2) = G({F(0), F(1)}), and so on. Thus, if we consider
the class C of ordinals α for which we can define F with the required
property on the initial segment of ordinals < α, then C satisfies the
conditions 9(a)-(c), and therefore contains all the ordinals. Uniqueness
follows similarly (if F ≠ F′, consider the least α with F(α) ≠ F′(α)).
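
On the finite ordinals the recursion scheme can be run mechanically. The sketch below is an illustration under assumptions made here: ordinals 0, ..., n are represented by list positions, sets by frozensets, and the two sample functions G_identity and G_levels are choices made for the example, not taken from the text.

```python
from itertools import combinations

def recurse(G, n):
    """Return [F(0), ..., F(n)], where F(k) = G({F(j) : j < k})."""
    values = []
    for _ in range(n + 1):
        values.append(G(frozenset(values)))
    return values

def power_set(s):
    """Return the set of all subsets of s."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for k in range(len(elems) + 1)
        for c in combinations(elems, k)
    )

def G_identity(values):
    """G(S) = S; then F(k) = {F(j) : j < k} is the ordinal k itself."""
    return values

def G_levels(values):
    """G(S) = union of the power sets of the members of S; then F(k) = V_k."""
    out = set()
    for x in values:
        out |= power_set(x)
    return frozenset(out)

ordinals = recurse(G_identity, 4)
first_levels = recurse(G_levels, 4)
print([len(a) for a in ordinals], [len(v) for v in first_levels])
```

With G_identity the recursion reproduces the finite ordinals, and with G_levels it reproduces the first levels of the universe, since the union of the power sets of V_0, ..., V_{k−1} is V_k.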


11. Measuring cardinalities. Different ordinals can have the same
cardinality. For example, all the ordinals ω_0, ω_0 + 1, ω_0 + 2, ... (and
many more after them!) are countable. However, jumps in cardinality occur
arbitrarily far out.

An ordinal which does not have the same cardinality as any lower ordinal
is called a cardinal. All finite ordinals and ω_0 are cardinals. Clearly, any
infinite cardinal is a limit ordinal. Further, any set has the same cardinality
as some cardinal, and, in fact, a unique one (see §1 of Chapter III). The
infinite cardinals form a totally ordered class, which is naturally indexed
by ordinals. Thus:


ω_0 = the first countable ordinal;
ω_1 = the first ordinal of cardinality > ω_0
    = the set of all finite and countable ordinals;
ω_2 = the first ordinal of cardinality > ω_1
    = the set of all ordinals of cardinality ≤ ω_1,


and so on. 
We can now give our fundamental definition. 


12. Definition. The (von Neumann) universe V is the class of sets
⋃_{α∈On} V_α, where the set V_α is defined by the following transfinite
recursion:


V_0 = ∅,
V_{α+1} = 𝒫(V_α),
V_α = ⋃_{β<α} V_β if α is a limit ordinal.


We give some elementary properties of the universe V. 


13. Each of the sets V_α is transitive: if Y ∈ X ∈ V_α, then Y ∈ V_α. (In
other words, V_α ⊂ V_{α+1}.)

Suppose that this were not true. Then there would exist a least ordinal α
with V_α ⊄ V_{α+1}, where α > 2. If α is not a limit ordinal, α = β + 1,
Y ∈ X ∈ V_α, and Y ∉ V_α, then we obtain a contradiction as follows:
X ∈ V_{β+1} = 𝒫(V_β) ⇒ X ⊂ V_β ⇒ Y ∈ V_β ⇒ Y ∈ V_{β+1} = V_α, since
for β it is still true that V_β ⊂ V_{β+1} by our choice of α. If α is a limit
ordinal, the argument is analogous (find γ < α with Y ∈ X ∈ V_γ and
Y ∉ V_γ). □


We define the rank of any set X ∈ V as follows: rank X = α if α is the
least ordinal such that X ∈ V_{α+1}. If Y ∈ X, then rank X ≥ rank Y + 1.


14. All ordinals belong to V, and rank α = α.

We first show that α ∈ V_{α+1} for all ordinals α. This is true for α = 0.
Suppose that α is the least ordinal with α ∉ V_{α+1}. If α = β + 1, then
β ∈ V_{β+1}, so that β and {β} ∈ V_{β+2} = 𝒫(V_{β+1}), and hence
α = β + 1 = β ∪ {β} ∈ V_{β+2} = V_{α+1}, a contradiction. On the other
hand, if α is a limit ordinal, then α = ⋃_{β<α} β and β ∈ V_{β+1} ⊂ V_α by
the choice of α, so that α = ⋃_{β<α} β ⊂ ⋃_{β<α} V_β = V_α, and
α ∈ 𝒫(V_α) = V_{α+1}, a contradiction. Therefore, rank α ≤ α. We
similarly prove equality. □
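
For hereditarily finite sets the rank can be computed recursively, which gives a quick check of the statement rank α = α for finite α. This is a sketch under assumptions made here: sets are frozensets, and the rank is computed from the equivalent description rank X = the least ordinal exceeding the ranks of all elements of X, rather than from the levels V_α directly.

```python
def rank(x):
    """Return the rank of a hereditarily finite set x (rank of ∅ is 0)."""
    return max((rank(y) + 1 for y in x), default=0)

def ordinal(n):
    """Return the finite von Neumann ordinal n."""
    a = frozenset()
    for _ in range(n):
        a = a | frozenset([a])
    return a

print([rank(ordinal(n)) for n in range(6)])
```

The printed ranks are 0, 1, ..., 5, and for any element Y of a set X built this way the inequality rank X ≥ rank Y + 1 holds by construction.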


15. The universe V is closed with respect to the standard set operations:
difference, union, intersection, forming 𝒫(X) and ⋃_{Y∈X} Y, and
“collecting” sets indexed by any set: {X_Y | Y ∈ Z}. In particular, if
X, Y ∈ V_α, then the pair {X, Y} ∈ V_{α+1}. We write {X} in place of
{X, X}.


16. Direct products, relations, and functions can also be defined as
elements of V using a device of Kuratowski. The intuitive notion of an
ordered pair of sets X, Y ∈ V is realized by means of the set


⟨X, Y⟩ = {{X}, {X, Y}} ∈ V.


As elements of V, ordered pairs are characterized by the following
properties: an ordered pair is a set of two elements X′ and Y′, one of which
is a subset of the other (say X′ ⊂ Y′); if X′ ⊂ Y′, then X′ = {X} is a
one-element set, and X is called the first term of the pair; Y′ is a set of at
most two elements, and its element Y which is different from X (if it
exists) or X itself (otherwise) is called the second term of the pair. Thus,
⟨X, Y⟩ = ⟨X″, Y″⟩ if and only if X = X″ and Y = Y″, which justifies the
name “ordered pair.”
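
Kuratowski's device and the recovery of the two terms can be tried out on small sets. The following sketch is only an illustration: sets are frozensets, and the names pair, first, and second are introduced here for the example.

```python
def pair(x, y):
    """Return the Kuratowski pair {{x}, {x, y}}."""
    return frozenset([frozenset([x]), frozenset([x, y])])

def first(p):
    """Return the first term: the one element common to all members of p."""
    (x,) = frozenset.intersection(*p)
    return x

def second(p):
    """Return the second term: the element other than the first term,
    if there is one, and the first term itself otherwise."""
    rest = frozenset.union(*p) - {first(p)}
    if rest:
        (y,) = rest
        return y
    return first(p)

zero = frozenset()
one = frozenset([zero])
print(pair(zero, one) == pair(one, zero), first(pair(zero, one)) == zero)
```

The two pairs built from the same terms in opposite order are different sets, and the degenerate pair with equal terms collapses to the one-element set {{X}}, from which both terms are still recovered.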

We emphasize that this definition is introduced so that the direct
product construction does not leave the universe V, and so that a set
corresponding to a direct product can be described in terms of the relation
∈, i.e., in the language L_1Set.

An ordered n-tuple of sets is defined as


⟨X_1, ..., X_n⟩ = ⟨···⟨⟨X_1, X_2⟩, X_3⟩, ..., X_n⟩.
We define the direct product of two sets as
X × Y = {⟨U, W⟩ | U ∈ X, W ∈ Y}.
Similarly,


X_1 × ··· × X_n = (···((X_1 × X_2) × X_3) × ··· × X_n).


We note that, in general, (X × Y) × Z ≠ X × (Y × Z); we only have a
canonical one-to-one correspondence between these two sets. But it is
usually harmless to take the liberty of identifying the two sets and writing
X × Y × Z.

A binary relation (or correspondence) r is a set (or class) all of whose
elements are ordered pairs. If r ∈ V is a relation, then its domain of
definition dom(r) is the class of all first terms in the elements of r, and the
range of values rng(r) is the class of all second terms.

A function is a binary relation in which each element is uniquely
determined by its first term. Thus, functions which are maps of sets in V
are identified with their graphs. If f is a function, we often write W = f(U)
instead of ⟨U, W⟩ ∈ f. In addition, we set


f^{-1}(X) = {Y | f(Y) ∈ X},
f|_X = {⟨U, W⟩ ∈ f | U ∈ X}.


A family {X_Y | Y ∈ Z} as an element of V is defined to be a function
consisting of pairs {⟨Y, X_Y⟩ | Y ∈ Z}, and so on.

We again emphasize that the most important feature of these definitions
is that we do not introduce any new objects besides elements of V, or any
new relations other than those expressible in terms of ∈. It should also be
noted that, in accordance with the usual (“extensional”) notion, a property
of the elements of a set X ∈ V is a subset Y ⊂ X (consisting of all elements
with this property). Thus, Y ∈ V, so that properties, properties of
properties, properties of sets of properties, ... (with transfinite iteration)
are elements of V.

The “universe” V has earned its name. 


17. Finally, we show that a chain X_1 ∋ X_2 ∋ ··· of elements of V must
terminate (of course, with the empty set).

We prove that, if X is nonempty, then there exists a Y ∈ X with
Y ∩ X = ∅ (the desired result is obtained if we apply this to the set X of
terms in the chain). In fact, let Y be the element of least rank in X (which
exists because the ranks, since they are ordinals, are well-ordered). If we
had X ∩ Y ≠ ∅, then any element Z ∈ X ∩ Y would have lower rank
than Y, a contradiction. □
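
The choice of an element of least rank can be tested exhaustively on a small level of the universe. The sketch below is an illustration under assumptions made here: sets are frozensets, the rank is computed recursively from the ranks of the elements, and the check runs over all nonempty subsets of V_3.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s."""
    elems = list(s)
    return frozenset(
        frozenset(c)
        for k in range(len(elems) + 1)
        for c in combinations(elems, k)
    )

def rank(x):
    """Return the rank of a hereditarily finite set x."""
    return max((rank(y) + 1 for y in x), default=0)

def least_rank_member(x):
    """Return an element of the nonempty set x having least rank."""
    return min(x, key=rank)

V3 = power_set(power_set(power_set(frozenset())))  # four elements
nonempty_sets = [x for x in power_set(V3) if x]

ALL_DISJOINT = all(not (least_rank_member(x) & x) for x in nonempty_sets)
print(len(nonempty_sets), ALL_DISJOINT)
```

For each of the 15 nonempty sets X tested, the chosen element Y satisfies Y ∩ X = ∅, as the argument predicts.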


18. Connection with the axioms of L_1Set. The point of view adopted in this
book is as follows.

The intuitive notion of a set, to which we appealed when constructing
the universe V, is the primary material. The language L_1Set was devised in
order to write formal texts based on this material which are equivalent to
our intuitive arguments concerning V. The axioms of L_1Set (including the
logical axioms) are obtained as a result of analyzing intuitive proofs. Our
criterion for the completeness of this list is that we can write a formal
deduction which translates any intuitive proof. The fact that we are able to
do this must be proved by a rather large compendium of formal texts,
which can be found in other books on logic. In particular, in L_1Set we can
write the formula “∀x ∃ordinal α (x ∈ V_α)” and deduce it from the
axioms. This formula is the formal expression of our restriction to sets in
V.

The question of the formal consistency of the Zermelo-Fraenkel axioms
must remain a matter of faith, unless and until a formal inconsistency is
demonstrated. So far none of the proofs based on these axioms has led to a
contradiction; rather, they have opened up before us the rich world of
classical and modern mathematics. This world has a certain reality and life
of its own, which depends little on the formalisms called upon to describe
it.

The discovery of a contradiction in any of various formalisms, even if it
should occur, would merely serve to clarify, refine, and perhaps
reconstruct certain of our ideas, but would not lead to their downfall, as
has happened several times in the past.


The continuum problem and forcing


1 The problem: results, ideas 


1.1. Cantor introduced two fundamental ideas in the theory of infinite sets:
he discovered (or invented?) the scale of cardinalities of infinite sets, and
gave a proof that this scale is unbounded. We recall that two sets M and N
are said to have the same cardinality (card M = card N) if there exists a
one-to-one correspondence between them. We write card M ≤ card N if
M has the same cardinality as a subset of N. We say that M and N
are comparable if either card M ≤ card N or card N ≤ card M. We write
card M > card N if card M ≥ card N but M and N do not have the same
cardinality.


1.2. Theorem (Cantor, Schröder, Bernstein, Zermelo)

(a) Any two sets are comparable. If both card M ≤ card N and card
N ≤ card M, then card M = card N. In other words, the cardinalities are
linearly ordered.

(b) Let 𝒫(M) be the set of all subsets of M. Then card 𝒫(M) > card
M. In particular, there does not exist a largest cardinality.

(c) In any class of cardinalities there is a least cardinality. In other
words, the cardinalities are well-ordered.


PROOF. 

(a) Suppose M has the same cardinality as the subset M′ ⊂ N and N has
the same cardinality as the subset N_1 ⊂ M. We identify M with M′.
We then have three sets N_1 ⊂ M ⊂ N and a one-to-one correspondence
f: N → N_1. We must construct a one-to-one correspondence g: N → M.
Here is an explicit definition of such a map:


g(x) = f(x), if x ∈ f^n(N) \ f^n(M) for some n ≥ 0,
g(x) = x, otherwise.


Here f^n(y) = f(f(···f(y)···)) (n times); f^n(N) = {f^n(y) | y ∈ N},
and f^0(y) = y. We leave the verification that g has the required properties
to the reader.
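
The map g can be tried out on a concrete case. The sketch below is only an illustration; the sets N = {0, 1, 2, ...}, M = N \ {1}, N_1 = {2, 4, 6, ...} and the map f(x) = 2x + 2 are assumptions chosen for the example and do not come from the text. Since N \ M = {1} and f is one-to-one, the condition "x ∈ f^n(N) \ f^n(M) for some n ≥ 0" says that x lies on the orbit 1, f(1), f(f(1)), ....

```python
def f(x):
    """A one-to-one correspondence of N = {0, 1, 2, ...} with N_1 = {2, 4, 6, ...}."""
    return 2 * x + 2

def on_orbit_of_one(x):
    """Decide whether x = f^n(1) for some n >= 0, by undoing f repeatedly."""
    while True:
        if x == 1:
            return True
        if x < 2 or x % 2 == 1:
            return False  # x is not in the image of f, so the orbit is missed
        x = (x - 2) // 2

def g(x):
    """The map of the text: f on the orbit of N \\ M, the identity elsewhere."""
    return f(x) if on_orbit_of_one(x) else x

LIMIT = 200
image = {g(x) for x in range(LIMIT)}
print(sorted(image)[:8], 1 in image)
```

On the initial segment tested, g is one-to-one, never takes the value 1, and takes every other value below the limit, as a one-to-one correspondence of N with M should.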

To prove that any two sets are comparable, it is sufficient to show that
any set can be well-ordered, since Lemma 5 of the Appendix to Chapter II
implies that well-ordered sets are comparable to each other. Let M be any
set. For every nonempty subset N ⊂ M choose an element c(N) ∈ N. We
call a well-ordering < of a subset M′ ⊂ M admissible (with respect to c) if
c(M \ X̂) = X for all X ∈ M′, where X̂ = {Y | Y ∈ M′, Y < X}.

We claim that, if M′ ≠ M″ are two subsets of M having admissible
well-orderings, then one set is an initial segment of the other, and the
orderings are compatible. In fact, as in subsection 7(a) of the Appendix to
Chapter II, we prove that the canonical isomorphism f of, say, M′ with an
initial segment of M″ is the identity imbedding: if f(X) ≠ X and X is the
least element with this property, then


f(X̂) = X̂, X = c(M \ X̂) ⇒ X = c(M \ f(X̂)) = f(X),


which is a contradiction.

It is now easy to see that the union M′ of all subsets of M which have a
well-ordering admissible with respect to c itself has an admissible ordering;
moreover, M′ coincides with M, since otherwise we could imbed M′ in
M′ ∪ {c(M \ M′)}.

In particular, it follows that any set has the same cardinality as some 
ordinal, and hence the same cardinality as a unique cardinal. This justifies 
the use of the term “cardinality” and the use of cardinals as our standard 
scale of cardinalities (see subsection 11 of the Appendix to Chapter II). 

(b) Since 𝒫(M) contains all the one-element subsets of M, we have card
𝒫(M) ≥ card M. In addition, no map f: M → 𝒫(M) can be a one-to-one
correspondence (or even onto). In fact, we set


N = {z | z ∉ f(z)} ∈ 𝒫(M),
and show that N is not contained in the image of f. If there existed an
n ∈ M such that N = f(n), we would immediately obtain a contradiction
by considering the relationship of n to N:
n ∈ N ⇒ n ∈ f(n) ⇒ n ∉ N by the definition of N;
n ∉ N ⇒ n ∉ f(n) ⇒ n ∈ N by the definition of N.


This is Cantor’s famous “diagonal process.” 
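
For a small finite set the diagonal argument can be checked over every possible map. The sketch below is an illustration under assumptions made here: M is the three-element set {0, 1, 2}, a map f: M → 𝒫(M) is stored as a dictionary, and the helper names are introduced only for the example.

```python
from itertools import combinations, product

def power_set(m):
    """Return the list of all subsets of m, each as a frozenset."""
    elems = list(m)
    return [
        frozenset(c)
        for k in range(len(elems) + 1)
        for c in combinations(elems, k)
    ]

def diagonal_set(f, m):
    """Return N = {z in m : z not in f(z)}."""
    return frozenset(z for z in m if z not in f[z])

M = (0, 1, 2)
SUBSETS = power_set(M)

def every_map_misses_its_diagonal():
    """Check, for all maps f from M to its power set, that N is not a value of f."""
    for images in product(SUBSETS, repeat=len(M)):
        f = dict(zip(M, images))
        if diagonal_set(f, M) in images:
            return False
    return True

print(len(SUBSETS) ** len(M), every_map_misses_its_diagonal())
```

All 512 maps are examined, and in each case the set N is absent from the image, so none of them is onto.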

(c) The well-ordering of the cardinals is established at the same time as
their comparability in the first stage of the theory of ordinals (see the
Appendix to Chapter II). □


1.3. Remark. This proof of the lemma that any set can be well-ordered is
essentially due to Zermelo. It was probably what prompted the most severe 
objections to the axiom of choice. The intuitive idea behind the proof 
reduces to a recipe for choosing one element after another from the set M 
until all of M is exhausted. In this form it is immediately apparent that the 
prescription is “physically” unthinkable, and to many of Zermelo’s con- 
temporaries the whole proof seemed to be nothing but a trick. For 
example, the idea of “first” choosing an element c(N) in each subset 
N ⊂ M met with the following objection of Lebesgue. If the elements we
choose are not characterized by any special properties, how do we know 
that we are always thinking about the same elements throughout the proof? 
But today, except for specialists in the foundations of mathematics, hardly 
any working mathematicians share these doubts. 

We now formulate the basic problem that will concern us during the
next two chapters. We shall write card 𝒫(M) = 2^{card M}, in analogy to the
finite case. The continuum is 2^{ω_0}.


1.4. The continuum problem. What place does the continuum occupy on the 
scale of cardinalities? 

By Theorem 1.2(b), we have $2^{\omega_0} > \omega_0$. Hence, in any case, $2^{\omega_0} \ge \omega_1$. On
the other hand, if $2^{\omega_0} > \omega_1$, $2^{\omega_0} > \omega_2$, ..., $2^{\omega_0} > \omega_n$, ... for any $n$, then we
would have $2^{\omega_0} > \omega_\omega$, since the continuum cannot be a union of countably
many subsets of lower cardinality (König).
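
The last step can be written out using cofinalities. This is a sketch that relies on the standard form of König's theorem, $\operatorname{cf}(2^{\omega_0}) > \omega_0$; the notation $\operatorname{cf}$ is not used in the text itself.

```latex
% omega_omega is a countable supremum, so its cofinality is omega_0:
\[
  \omega_\omega = \sup_{n < \omega} \omega_n
  \quad\Longrightarrow\quad
  \operatorname{cf}(\omega_\omega) = \omega_0 .
\]
% Koenig: the continuum is not a countable union of sets of smaller cardinality:
\[
  \operatorname{cf}\bigl(2^{\omega_0}\bigr) > \omega_0
  \quad\Longrightarrow\quad
  2^{\omega_0} \neq \omega_\omega .
\]
% Combining this with the hypothesis on all finite n:
\[
  2^{\omega_0} > \omega_n \ \text{for all } n
  \;\Longrightarrow\; 2^{\omega_0} \geq \omega_\omega
  \;\Longrightarrow\; 2^{\omega_0} > \omega_\omega .
\]
```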


1.5. The Continuum Hypothesis (CH). $2^{\omega_0} = \omega_1$.

The Generalized Continuum Hypothesis asserts that $2^{\operatorname{card} M}$ comes
immediately after card M for any infinite M. Here is almost everything we 
know about this question: 


1.6. Theorem 
(a) The negation of the Continuum Hypothesis cannot be deduced from 
the other axioms of set theory, if those axioms are consistent (Gödel).
(b) The Continuum Hypothesis cannot be deduced from the other 
axioms of set theory, if those axioms are consistent (Cohen). 


The same holds true for the Generalized Continuum Hypothesis. 

If we grant that the axioms of set theory and the logical means of 
expression and deduction in $L_1\mathrm{Set}$, which are implicit in the statement of
Theorem 1.6, actually exhaust the apparatus for constructing proofs in 
modern mathematics, then we can say that the continuum problem is the 
only known example of an absolutely undecidable problem. Although 
Gödel’s incompleteness theorem provides concrete examples of undecidable
propositions in any formal system having reasonable properties, these
examples can be decided in an “obvious” way in some higher system. The
situation with the continuum problem seems much more difficult. If we
agree that it is a meaningful question, then it can only be decided by
introducing a new principle of proof. Various possibilities for doing this
have been discussed, but none of the suggested new axioms for set theory 
seem sufficiently convincing or, more important, sufficiently useful in 
“real” mathematics. In the hundred years since the introduction of trans- 
finite induction, not a single new method of constructing sets has come 
into common use. Incidentally, the basic idea in Gödel’s proof of Theorem
1.6(a) actually consists in verifying that all the old methods allow us to
construct at most $\omega_1$ subsets of $\omega_0$ (or, equivalently, at most $\omega_1$ real
numbers).


1.7. Gödel’s idea. Gödel considers the basic set-theoretic operations (forming
pairs, products, complements, sums, and so on) and constructs the
class of all sets which are obtained by transfinite iteration of these
operations, starting from $\emptyset$. Such sets are called constructible sets. It is a
priori completely unclear whether or not all subsets of {0, 1, 2,...} are 
constructible, or, more generally, whether or not all sets in the universe V 
are constructible. (It turns out that this problem is formally undecidable to 
the same extent as the continuum problem.) But we find that, within the 
class of constructible sets, the number of subsets of {0, 1, 2,... } is equal 
to $\omega_1$, most likely because we have omitted a vast number of nonconstructible
sets. Meanwhile, all the axioms of set theory, restricted to this
class, are true (in a reasonable meaning of “true”), as are all deductions 
from these axioms. Hence the negation of the CH is not deducible, since it 
is false in this model. The next chapter will be devoted to Gödel’s theorem.


1.8. Cohen’s idea. We shall present this idea in the version due to Scott and 
Solovay. First we give its application to a certain simplified problem, 
concerned with a language weaker than $L_1\mathrm{Set}$; then in §§4-8 we present
the application to $L_1\mathrm{Set}$. For another version of Cohen’s idea, see §9.

We shall discuss the CH in the form: there does not exist a subset of the 
real numbers R whose cardinality is strictly between that of {0, 1, 2,...} 
and that of R. In fact, if we had $2^{\omega_0} > \omega_1$, then any subset of R of
cardinality $\omega_1$ would have such an intermediate cardinality.

In order to show that this assertion is not deducible, which is equivalent 
to Cohen’s theorem, it suffices to construct a model of the real numbers in 
which all the axioms and all propositions deducible from them are fulfilled 
and in which a set of intermediate cardinality exists. This model will be the
set $\overline{R}$ of random variables on a very big probability space $\Omega$. For a suitable
choice of $\Omega$, $\overline{R}$ will be so big that within the model there exists a set of
intermediate cardinality, containing N (the integers of the model) and
contained in $\overline{R}$ (the continuum of the model).

Of course, it cannot be quite this simple; there must be some obstacle to 
carrying out this program. The obstacle is that almost all the properties of
R, including most of the axioms, turn out to be false for $\overline{R}$, so that $\overline{R}$
cannot be a model for R in the usual sense of the word. Cohen’s basic idea
was to develop a method for overcoming this difficulty. He replaced the
property of an assertion being true by another property, which we shall
temporarily call “truth” in quotes, and which has the necessary formal
properties. Namely, all the axioms of R are “true” in $\overline{R}$, all deductions
from “true” assertions using the rules of logic again lead to “true” 
assertions, and the CH is not “true,” and hence is not deducible from the 
axioms. We now show in greater detail how this is done. 


1.9. Let $I$ be a set of cardinality $> \omega_1$. We set

$$\Omega = [0, 1]^I, \text{ with Lebesgue measure},$$

$$\overline{R} = \text{the set of random variables on } \Omega = \text{the set of measurable real-valued functions on } \Omega.$$


1.10. Theorem 
(a) All the axioms of the real numbers and all deductions from them are
“true” for $\overline{R}$.
(b) The CH is not “true” for $\overline{R}$.
Here we say that an assertion $P$ about random variables $x, y, \ldots \in \overline{R}$ is
“true” if the following condition is fulfilled:


for each point $\omega \in \Omega$ we consider the values $x(\omega), y(\omega), \ldots$ of the random
variables $x, y, \ldots$ and form the assertion $P_\omega$ about these ordinary real
numbers; then for almost all $\omega \in \Omega$ (i.e., all but a set of measure 0) $P_\omega$ is
true in the usual sense of the word.


Briefly, “truth” means experimental truth with probability one. 


EXAMPLE. Let $P$ be the assertion that “$\overline{R}$ has no zero divisors,” i.e., “if
$x, y \in \overline{R}$ are such that $xy = 0$, then either $x = 0$ or $y = 0$.” Then the
assertion “$\overline{R}$ has no zero divisors” is, of course, not true. However, it is
“true” because: if $x, y \in \overline{R}$ are such that $xy = 0$, then for almost all $\omega \in \Omega$
either $x(\omega) = 0$ or $y(\omega) = 0$.
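
The zero-divisor example can be imitated on a toy probability space. The sketch below is an illustration under strong simplifying assumptions: a four-point sample space `OMEGA` with equal weights stands in for the space of trials, so there are no nonempty null sets, and the names `prob`, `x`, and `y` are hypothetical.

```python
from fractions import Fraction

OMEGA = range(4)  # hypothetical finite sample space; each point has probability 1/4


def prob(event):
    """Probability of the set of sample points where `event` holds."""
    return Fraction(sum(1 for w in OMEGA if event(w)), len(OMEGA))


# Two random variables with disjoint supports: indicators of {0, 1} and {2, 3}.
x = lambda w: 1 if w < 2 else 0
y = lambda w: 1 if w >= 2 else 0

# xy = 0 as a random variable, yet neither x nor y is the zero random variable.
print(prob(lambda w: x(w) * y(w) == 0))        # probability that xy vanishes
print(prob(lambda w: x(w) == 0))               # probability that x vanishes
print(prob(lambda w: y(w) == 0))               # probability that y vanishes

# Pointwise, however, "x = 0 or y = 0" holds with probability one.
print(prob(lambda w: x(w) == 0 or y(w) == 0))
```

Here x and y are genuine zero divisors in the ring of random variables, while the assertion evaluated trial by trial holds at every sample point.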


1.11. In order to give a precise meaning to the definition of “truth” and 
learn how to verify effectively the “truth” of rather complicated assertions, 
we must introduce a formal language, in this case the language of real 
numbers. This formal language is a mathematical object, and the precise 
formulation of Theorem 1.10 will concern this object, and not R or $\overline{R}$ at
all.

The connection between this language and R is given by a system of 
informal recipes which tell how to translate the usual intuitive texts about 
R into this language, and by a system of theorems which tell us that the 
translation is always possible and that the recipes are faithful to the 
informal texts. The role of $\overline{R}$ is reduced to that of an auxiliary construction
which is used to define and compute a special “truth” function on the 
formulas of the language. Thus we see the role of logic in the program. 

1.12. A detailed proof of Theorem 1.10 would be rather lengthy and
nontrivial for several reasons. In the first place, a certain amount of space 
must be devoted to describing the formal language and the axioms of R in 
this language. We must then verify that all the axioms are “true” and that 
the CH is not “true”; this amounts to one or two dozen verifications, each
of which involves an inductive argument with infinite sums and products
in the Boolean algebra of measurable sets in $\Omega$. However, the most serious
difficulties arise because the meaning of every assertion changes considerably
in going from R to $\overline{R}$, and not always in a convenient direction. We
shall illustrate this qualitative aspect by attempting to explain why the CH 
is not “true,” and why this is nontrivial. 

As we have said, we want to construct a subset $M$ of $\overline{R}$ having
cardinality intermediate between the cardinality of N and the cardinality
of $\overline{R}$. We do this as follows: for any $i \in I$, let the random variable
$x_i : [0, 1]^I \to [0, 1]$ be the $i$th projection. Choose a subset $\mathcal{J} \subset I$ such that
$\omega_0 < \operatorname{card} \mathcal{J} < \operatorname{card} I$ (this is possible if $I$ is large), and set

$$M = \{\, x_j \mid j \in \mathcal{J} \,\} \subset \overline{R}.$$

Then card N < card $M$ < card $\overline{R}$ is true in the usual meaning of the
word. However, we must show that the corresponding assertion is “true” in
our Pickwickian sense. But then the role of the integers is assumed by the
“locally integral” random variables (whose values are integral with probability
one), and these random variables can have cardinality much greater
than $\omega_0$. Thus, the required lower estimate for card $M$ becomes much more
serious. Similarly, if we formalize our naive description of $M$ and then
interpret it in $\overline{R}$, then $M$ takes on a new meaning, and leads to a much
larger set than the “real” $M$. Thus, it is also unclear that the upper
inequality for card $M$ still holds. It seems almost miraculous that everything
eventually falls into place.

The plan for the rest of the chapter is as follows. In §2 and §3 we give a 
(shortened) exposition for the second-order language of real numbers of 
this abbreviated version of the theorem that the CH is not deducible. If the 
reader is only interested in the complete proof for $L_1\mathrm{Set}$, he may skip to §4,
where we introduce the Boolean-valued “universe of random sets,” which
takes the place of $V$. In §§5-7 we verify that the Zermelo–Fraenkel axioms
are “true,” and in §8 we verify that the CH is “false.” Finally, in §9 we 
discuss Cohen’s original method, which is more syntactic and involves 
somewhat different intuitive ideas. 


2 A language of real analysis 


2.1. In this section we describe a formal language based on the theory of 
real numbers. In particular, this means that the variables x, y, z will be 
considered as names of real numbers. However, if we try to use a 
first-order language to formulate the assertions we are interested in, such 
as the Continuum Hypothesis CH, or even the completeness axiom (which
differentiates the real numbers from the rational numbers), we find that we 
are not able to do this. In fact, in these assertions we need to refer to 
arbitrary subsets (or relations of degree one) of the real numbers, whereas 
first-order languages do not have symbols for variable relations (compare 
with subsection 3.17 of Chapter I). 

This leads us to consider the second-order language $L_2\mathrm{Real}$, which is the
most economical language in which the axioms and the CH can be 
expressed. We shall give a brief description of this language, for the most 
part only noting those features which show the connections with the real 
numbers and those which are peculiar to second-order languages. 


2.2. The language $L_2\mathrm{Real}$. The alphabet consists of the variable symbols
$x, y, z, \ldots$; the symbols for degree 1 functions $f, g, h, \ldots$; the constants
0 and 1; the degree 2 operations $+$ and $\cdot$; the degree 2 relations $=$ and
$\le$; and the same connectives, quantifiers, and parentheses as in languages
of $\mathcal{L}_1$. The terms are $x, y, z, \ldots$ and 0 and 1; and also $f(t)$, $t_1 \cdot t_2$, and
$t_1 + t_2$ if $f$ is a function symbol and $t$, $t_1$, and $t_2$ are terms. The terms are
names of real numbers.

The atomic formulas are $t_1 = t_2$ or $t_1 \le t_2$, where $t_1$ and $t_2$ are terms. The
set of formulas is defined inductively exactly as in languages of $\mathcal{L}_1$, with
one addition: $\forall f(Q)$ and $\exists f(Q)$ are formulas if $Q$ is a formula and $f$ is the
symbol for a variable function. The notions of a free occurrence of a
variable ($x$ or $f$), of a closed formula, and so on carry over to $L_2\mathrm{Real}$ in the
obvious way. We shall use the same type of abbreviated notation here as in 
Chapter I. The standard interpretation of formulas which is implicit in the 
language should be obvious from the definitions and from the following 
examples. 


2.3. The formula Z(y): “y is an integer.” It is perhaps not completely 
obvious how to write this formula. We can write, “y can be obtained from 
0 by repeatedly adding or subtracting 1,” or else “any function $f$ which has
period 1 and vanishes at 0 must also vanish at $y$,” i.e.,


$$Z(y):\quad \forall f\bigl((f(0) = 0 \wedge \forall x(f(x) = f(x + 1))) \Rightarrow f(y) = 0\bigr).$$


2.4. The formula CH: “Any subset of R either has the same cardinality as R, 
or else is countable or finite.” 

We first restate the formula in different words: “Given a set of zeros of
any function $h$, either there exists a function $g$ mapping it onto all R, or
else there exists a function $f$ mapping the integers onto all of this set.” We
then have:


$$\mathrm{CH}:\quad \forall h\bigl(\exists g\, \forall y\, \exists x\,(h(x) = 0 \wedge y = g(x)) \vee \exists f\, \forall y\,(h(y) = 0 \Rightarrow \exists x\,(Z(x) \wedge y = f(x)))\bigr).$$

Notice that the formula $Z(x)$ occurs as part of the CH.

We further write the completeness axiom C:


2.5. The formula C: “Any subset of R (the set of values of a function f) which 
is bounded from above has a least upper bound z.” We write: 


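$$C:\quad \forall f\bigl(\exists y\, \forall x\,(f(x) \le y) \Rightarrow \exists z\, \forall y\,(\forall x\,(f(x) \le y) \Leftrightarrow z \le y)\bigr).$$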


All the other formulas we are interested in are simpler and do not require 
any special comment. 

We now give a precise definition of the property of “truth” for closed 
formulas in $L_2\mathrm{Real}$; this property was described informally in §1. We
emphasize that it is not an absolute property, but rather depends on the
choice of the probability space $\Omega$ which is used to construct the “model” of
the real numbers. 


2.6. The algebra of truth values. As in §1, we set

$I$ = a set;
$\Omega = [0, 1]^I$ with Lebesgue measure;
$B$ = the algebra of measurable sets in $\Omega$ modulo sets of measure zero;
0 = the class of the empty set in $B$;
1 = the class of $\Omega$ in $B$.

We have the following operations in $B$:

$a'$, the “complement” of the element $a \in B$;
$a \wedge b$, the “intersection” of two elements $a, b \in B$;
$a \vee b$, the “union” of two elements $a, b \in B$.

These operations satisfy the usual identities and give a Boolean algebra
structure on $B$. We write $a \le b$ if $a \wedge b = a$.

Moreover, the operations of intersection and union extend uniquely to 
infinite families of elements, and continue to satisfy the usual identities 
which hold in the algebra of all subsets of any given set. We shall omit the 
verification of all this. We only note that sets here are identified “modulo
sets of measure zero,” and that identities of the type
$(A \bmod 0) \cap (B \bmod 0) = (A \cap B) \bmod 0$ do not carry over to infinite families.

Finally, $B$ satisfies the following countable chain condition: if $a_\alpha \wedge a_\beta = 0$
for all distinct indices $\alpha$ and $\beta$, then $a_\alpha \ne 0$ for at most countably many
indices $\alpha$. This follows because Lebesgue measure is positive and additive.
Technically speaking, B is a complete Boolean algebra with the countable 
chain condition. The precise origin of B and the fact that it has a measure 
play a less important role. 
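
The step from positivity and additivity of the measure to the countable chain condition can be spelled out as follows. This is a sketch of the standard argument; the symbol $\mu$ for the measure induced on $B$ and the sets $A_n$ are notation assumed here, not taken from the text.

```latex
% Assumption: \mu is the probability measure on B induced by Lebesgue measure,
% so \mu(a) > 0 whenever a \neq 0, and \mu is additive on pairwise disjoint elements.
% Let (a_\alpha) be a family with a_\alpha \wedge a_\beta = 0 for \alpha \neq \beta.
\[
  A_n = \{\, \alpha \;:\; \mu(a_\alpha) > 1/n \,\}, \qquad n = 1, 2, 3, \ldots
\]
% If \alpha_1, \dots, \alpha_k \in A_n are distinct, disjointness and additivity give
\[
  \frac{k}{n} < \sum_{m=1}^{k} \mu(a_{\alpha_m})
  = \mu\Bigl(\bigvee_{m=1}^{k} a_{\alpha_m}\Bigr) \le 1 ,
\]
% hence k < n, so each A_n is finite, and
\[
  \{\, \alpha \;:\; a_\alpha \neq 0 \,\} = \bigcup_{n \ge 1} A_n
  \quad \text{is at most countable.}
\]
```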


2.7. The interpretation set. We now introduce a large set $M$, each point $\xi$ of
which corresponds to the assignment of certain values to all the symbols in
the alphabet of $L_2\mathrm{Real}$. If $\xi$ is fixed, each formula becomes a concrete
statement about measurable functions (random variables) on $\Omega$ and about
functionals on them (compare with §2 of Chapter II).
More precisely, we set 


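$\overline{R}$ = the set of measurable real-valued functions on $\Omega$;
$\overline{R}^{(1)}$ = the set of all possible maps $f : \overline{R} \to \overline{R}$ which satisfy the condition:

$$\forall \bar{x}, \bar{y} \in \overline{R}:\quad \{\omega \in \Omega \mid \bar{x}(\omega) = \bar{y}(\omega)\} \le \{\omega \in \Omega \mid f(\bar{x})(\omega) = f(\bar{y})(\omega)\} \bmod 0.$$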


The definition of $\overline{R}^{(1)}$ has the following intuitive meaning. If we ignore the
“mod 0,” the condition simply means that the value of the random
variable $f(x)$ at each trial (each point in $\Omega$) must be determined by the
value of x at this trial. Of course, this is a very natural requirement if we 
want functions f to be adequate reflections of properties of ordinary 
real-valued functions in the sense of §1. The addition of “mod 0” weakens 
this requirement by saying “with conditional probability one.” 


We now return to the set $M$. A point $\xi \in M$ consists of a choice of

$x^\xi \in \overline{R}$, for each variable symbol $x$;
$f^\xi \in \overline{R}^{(1)}$, for each symbol $f$ for a variable function.

Here is the interpretation of the expressions in the language which corresponds
to a given choice of $\xi$:

(a) Terms. Let $t$ be a term, and let $\xi \in M$. Then $t^\xi \in \overline{R}$ is the random
variable which is defined inductively in the obvious way.

(b) The truth function $\|\ \|$ on atomic formulas. Let $P$ be the atomic
formula $t_1 \le t_2$ or $t_1 = t_2$. Its truth value at a point $\xi \in M$ is the element of
the algebra $B$ which is defined as follows:

$$\|t_1 \le t_2\|(\xi) = \{\omega \in \Omega \mid t_1^\xi(\omega) \le t_2^\xi(\omega)\} \bmod 0,$$

and similarly for $t_1 = t_2$.

(c) The truth function $\|P\|(\xi)$ in the general case. The general definition
proceeds by induction. The rules when formulas are joined by connectives
are the same as in subsection 5.7 of Chapter II:

$$\|\neg P\| = \|P\|',$$
$$\|P \vee Q\| = \|P\| \vee \|Q\|,$$
$$\|P \wedge Q\| = \|P\| \wedge \|Q\|,$$
$$\|P \Rightarrow Q\| = \|P\|' \vee \|Q\|,$$
$$\|P \Leftrightarrow Q\| = (\|P\| \wedge \|Q\|) \vee (\|P\|' \wedge \|Q\|').$$

Here, for brevity, we have omitted the $\xi$. Finally,

$$\|\forall x P\|(\xi) = \bigwedge_{\xi'} \|P\|(\xi') \quad \text{(over all } \xi' \text{ which only differ from } \xi \text{ by a variation of } x\text{)};$$
$$\|\exists x P\|(\xi) = \bigvee_{\xi'} \|P\|(\xi') \quad \text{(over the same } \xi'\text{)};$$

and similarly when we quantify over variable functions. Intuitively, the
value of the truth function of an assertion about random variables is the
set of trials mod 0 for which this assertion becomes true as a fact about
real numbers.
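
The inductive rules above can be tried out on a toy model. The following sketch is an assumption-laden illustration rather than the construction of the text: the sample space `OMEGA` is finite (so "mod 0" plays no role), random variables take only the values 0 and 1, only equality is atomic, quantifiers range over variables (not functions), and the tuple encoding of formulas is hypothetical.

```python
from itertools import product

OMEGA = frozenset(range(4))   # toy sample space (assumption)
VALUES = (0, 1)               # toy value set for random variables (assumption)
# All "random variables": functions OMEGA -> VALUES, stored as tuples indexed by omega.
RVS = list(product(VALUES, repeat=len(OMEGA)))


def truth(formula, xi):
    """Return ||formula||(xi) as a subset of OMEGA.

    `xi` maps variable names to random variables (an interpretation point).
    """
    op = formula[0]
    if op == "eq":                      # atomic formula: ("eq", var1, var2)
        _, s, t = formula
        return frozenset(w for w in OMEGA if xi[s][w] == xi[t][w])
    if op == "not":                     # complement in the Boolean algebra
        return OMEGA - truth(formula[1], xi)
    if op == "or":
        return truth(formula[1], xi) | truth(formula[2], xi)
    if op == "and":
        return truth(formula[1], xi) & truth(formula[2], xi)
    if op == "implies":                 # ||P => Q|| = ||P||' v ||Q||
        return (OMEGA - truth(formula[1], xi)) | truth(formula[2], xi)
    if op in ("forall", "exists"):      # ("forall", var, body)
        _, var, body = formula
        # Vary xi along `var` over all random variables.
        values = [truth(body, {**xi, var: rv}) for rv in RVS]
        if op == "forall":
            return frozenset.intersection(*values)
        return frozenset.union(*values)
    raise ValueError(f"unknown connective: {op}")


# A closed formula: for every x there is a y different from x.
closed = ("forall", "x", ("exists", "y", ("not", ("eq", "x", "y"))))
print(truth(closed, {}) == OMEGA)
```

For an open formula the result depends on the interpretation point `xi`; for the closed formula above it does not, in line with the lemma that follows.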


2.8. Lemma. If $P$ is a closed formula, then $\|P\|(\xi)$ does not depend on the
choice of $\xi \in M$ and only takes the value 0 or 1.


This is proved by a simple induction on the length of P. It is just as easy 
to prove a more general fact: if $P$ is any formula and $\xi$ and $\xi'$ do not differ
on variables which occur freely in $P$, then $\|P\|(\xi) = \|P\|(\xi')$. Compare with
Proposition 2.10 in Chapter II. 

This value of $\|P\|(\xi)$ which is common for all $\xi$ if $P$ is closed can be
denoted simply $\|P\|$. We are now ready to formulate the basic definition
of this section: 


2.9. Definition. A formula $P$ in $L_2\mathrm{Real}$ is said to be “true” if $\|P\|(\xi) = 1$ for
all $\xi \in M$.


3 The Continuum Hypothesis is not deducible 
in $L_2\mathrm{Real}$


3.1. Fundamental Lemma 


(a) “Truth” is preserved under the rules of deduction. 

(b) The first-order logical axioms and the versions of them in $L_2\mathrm{Real}$ are
“true.”

(c) The special axioms of $L_2\mathrm{Real}$ are “true.”

(d) The CH is not “true” if card $I > \omega_1$.


This lemma implies

3.2. Theorem. The CH is not deducible from the axioms in $L_2\mathrm{Real}$.


In this section we give those parts of the proof of the Fundamental
Lemma which are also essential for the “real” Cohen theorem, as well as for
our simplified problem. We note that Theorem 3.2 is weaker than Cohen’s
theorem because the language $L_2\mathrm{Real}$ contains fewer means of expression
than the language of set theory. Although the Continuum Hypothesis can
be stated in $L_2\mathrm{Real}$, because of Gödel’s general results we have no basis for
expecting, even if the CH were deducible, that the proof could also be 
given in this language. For example, the deduction could require us to 
introduce functionals of functions, functionals of functionals, and so on. 
The language of set theory, which we shall return to in §4, contains the 
means for considering all of these finite and even transfinite levels at once. 


3.3. PROOF OF 3.1(a). If $\|P\| = 1$ and $\|P \Rightarrow Q\| = 1$, then $\|P\|' = 0$ and
$\|P\|' \vee \|Q\| = 1$, so that $\|Q\| = 1$. Secondly, if $\|P\| = 1$, then $\|P\|(\xi') = 1$
for all $\xi' \in M$; but then (here $\xi'$ runs through all variations of $\xi$ along $x$)

$$\|\forall x P\|(\xi) = \bigwedge_{\xi'} \|P\|(\xi') = \bigwedge_{\xi'} 1 = 1. \qquad \Box$$


We similarly prove this for Gen over functions. 


3.4. PROOF OF 3.1(b) (SKETCH). 

Tautologies. Their “truth” is proved in §5 of Chapter II. 

Quantifier axioms. The proof proceeds by induction on the length of the 
formulas in the axiom schemes. Since it is completely straightforward, we 
shall omit it. 


3.5. PROOF OF 3.1(c) (SKETCH). We shall list the axioms and make some 
brief comments. 

The special axioms of set theory: The axioms of equality and the axiom 
(schema) of choice 


$$\mathrm{AC}:\quad \forall x\, \exists y\, P(x, y) \Rightarrow \exists f\, \forall x\, P(x, f(x)),$$


where P is any formula which does not have any free variables except x 
and y, and where f is free for y in P. 

The special axioms of field theory: The axioms of the additive group, the 
axioms of the multiplicative group, and the distributivity of multiplication with
respect to addition.

The special order axioms: 


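$$x \le y \vee y \le x,$$
$$(x \le y \wedge y \le x) \Leftrightarrow x = y,$$
$$x \le y \Leftrightarrow (x + z \le y + z),$$
$$(x \le y \wedge 0 \le z) \Rightarrow xz \le yz.$$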


The completeness axiom (see 2.5). 

Among these axioms, the greatest effort is needed to verify that the 
axiom of choice and the completeness axiom are “true.” But these com- 
putations resemble those in the proof that the CH is false, which will be 
given in detail below. Hence, the verification of these two axioms will be 
omitted. 

The first axiom of equality is trivial. The second axiom is first verified
for atomic formulas P, and then we use induction on the length of P. The 
argument is rather tedious, but simple. 

The axioms of an ordered field are verified without difficulty. We shall 
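limit ourselves to one example: “every nonzero number has an inverse,”
i.e.,

$$\|\forall x(\neg(x = 0) \Rightarrow \exists y(xy = 1))\| = \bigwedge_{\bar{x} \in \overline{R}} \Bigl(\|\bar{x} = 0\| \vee \bigvee_{\bar{y} \in \overline{R}} \|\bar{x}\bar{y} = 1\|\Bigr).$$

To verify that this truth value equals 1, it suffices to prove this for each
term on the right, i.e., for each fixed $\bar{x} \in \overline{R}$. Then, in turn, for that $\bar{x}$ it
suffices to construct a random variable $\bar{y} \in \overline{R}$ such that $\|\bar{x} = 0\| \vee \|\bar{x}\bar{y} = 1\| = 1$.
We set

$$\bar{y}(\omega) = \begin{cases} \bar{x}(\omega)^{-1}, & \text{if } \bar{x}(\omega) \ne 0, \\ 0, & \text{if } \bar{x}(\omega) = 0. \end{cases}$$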
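
That this choice works can be read off pointwise. The short check below is a sketch in the notation of this example, where $\bar{x}$ is a fixed random variable and $\bar{y}$ is its pointwise inverse extended by 0 on the zero set of $\bar{x}$.

```latex
% On the set where \bar{x} does not vanish, the product \bar{x}\bar{y} equals 1:
\[
  \{\omega : \bar{x}(\omega) \neq 0\}
  \subseteq \{\omega : \bar{x}(\omega)\,\bar{y}(\omega) = 1\} .
\]
% Therefore the two truth values together cover the whole space:
\[
  \|\bar{x} = 0\| \vee \|\bar{x}\bar{y} = 1\|
  \;\ge\; \bigl(\{\omega : \bar{x}(\omega) = 0\} \cup \{\omega : \bar{x}(\omega) \neq 0\}\bigr) \bmod 0
  \;=\; 1 .
\]
% \bar{y} is measurable: the zero set of \bar{x} is measurable, and
% 1/\bar{x} is measurable on its complement.
```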


3.6. PROOF OF 3.1(d). We first recall the formula for the CH:

$$\forall h\bigl(\exists g\, \forall y\, \exists x\,(h(x) = 0 \wedge y = g(x)) \vee \exists f\, \forall y\,(h(y) = 0 \Rightarrow \exists x\,(Z(x) \wedge y = f(x)))\bigr).$$

We let $P_1$ and $P_2$ denote the first and the second alternatives in this
formula. Thus, the CH has the form $\forall h(P_1 \vee P_2)$. We must prove that
$\|\forall h(P_1 \vee P_2)\|(\xi) = 0$ for any point $\xi \in M$. By the definition in 2.7,

$$\|\forall h(P_1 \vee P_2)\|(\xi) = \bigwedge_{\xi'} \bigl(\|P_1\|(\xi') \vee \|P_2\|(\xi')\bigr),$$

where $\xi'$ runs through all variations of $\xi$ along $h$. To show that this value is
0, it suffices to find a point $\xi'$ such that $\|P_1\|(\xi') = \|P_2\|(\xi') = 0$. Since all
the variables except $h$ are bound in $P_1$ and $P_2$, choosing $\xi'$ is equivalent to
choosing $h^{\xi'} = \bar{h} \in \overline{R}^{(1)}$. We shall give $\bar{h}$ explicitly; this will be a function
“whose set of zeros has intermediate cardinality.”

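To do this, as in §1 we fix a subset $\mathcal{J} \subset I$ having cardinality strictly
between $\omega_0$ and card $I$. Recall that for each $i \in I$, $x_i \in \overline{R}$ is the “$i$th
coordinate” function. Further, for each random variable $\bar{x} \in \overline{R}$, we choose
a subset $\Omega(\bar{x}) \subset \Omega$ such that

$$\bigvee_{j \in \mathcal{J}} \|\bar{x} = x_j\| = \Omega(\bar{x}) \bmod 0$$

(here we use the completeness of $B$). Finally, we define $\bar{h} \in \overline{R}^{(1)}$ as follows
for every $\bar{x} \in \overline{R}$ and $\omega \in \Omega$:

$$\bar{h}(\bar{x})(\omega) = \begin{cases} 0, & \text{if } \omega \in \Omega(\bar{x}), \\ 1, & \text{otherwise.} \end{cases}$$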


3.7. Correctness Lemma

(a) For fixed $\bar{x}$, $\bar{h}(\bar{x})$ is measurable as a function of $\omega$, so that $\bar{h}$ maps $\overline{R}$
to $\overline{R}$.
(b) For every $\bar{x} \in \overline{R}$ we have

$$\|\bar{h}(\bar{x}) = 0\| = \bigvee_{j \in \mathcal{J}} \|\bar{x} = x_j\|.$$

(c) $\bar{h} \in \overline{R}^{(1)}$ (see 2.7), so that there exists a point $\xi' \in M$ for which
$h^{\xi'} = \bar{h}$.

PROOF. 

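(a) $\bar{h}(\bar{x})$ only takes the values 0 and 1 on $\Omega$, and the set where it takes
each of these two values is measurable by the definition and by the
completeness of $B$.

(b) is obvious from the definition.

(c) We must verify that for all $\bar{x}, \bar{y} \in \overline{R}$ we have

$$\{\omega \in \Omega \mid \bar{x}(\omega) = \bar{y}(\omega)\} \le \{\omega \in \Omega \mid \bar{h}(\bar{x})(\omega) = \bar{h}(\bar{y})(\omega)\} \bmod 0.$$

We shall show that the set of points $\omega \in \Omega$ for which both $\bar{x}(\omega) = \bar{y}(\omega)$ and
$\bar{h}(\bar{x})(\omega) \ne \bar{h}(\bar{y})(\omega)$ has measure zero.
It suffices to consider the case $\bar{h}(\bar{x})(\omega) = 0$, $\bar{h}(\bar{y})(\omega) = 1$, i.e., to show that

$$\|\bar{x} = \bar{y}\| \wedge \|\bar{h}(\bar{x}) = 0\| \wedge \|\bar{h}(\bar{y}) = 1\| = 0.$$

We write the second term in the form $\bigvee_{j \in \mathcal{J}} \|\bar{x} = x_j\|$ (by 3.7(b)) and apply
the distributive axiom to the first and second terms (where we use the
completeness of $B$). We further use the fact that
$\|\bar{x} = \bar{y}\| \wedge \|\bar{x} = x_j\| \le \|\bar{y} = x_j\|$. We then obtain:

$$\|\bar{x} = \bar{y}\| \wedge \|\bar{h}(\bar{x}) = 0\| \le \bigvee_{j \in \mathcal{J}} \|\bar{y} = x_j\| = \|\bar{h}(\bar{y}) = 0\|,$$

which immediately gives us the required result. □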
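
For reference, the chain of (in)equalities behind part (c) can be written in full. This is only a restatement of the steps just described, with $\bar{x}, \bar{y}$ arbitrary random variables, $x_j$ the coordinate functions indexed by the chosen subset $\mathcal{J}$, and $\bar{h}$ the function defined in 3.6.

```latex
\begin{align*}
  \|\bar{x} = \bar{y}\| \wedge \|\bar{h}(\bar{x}) = 0\|
    &= \|\bar{x} = \bar{y}\| \wedge \bigvee_{j \in \mathcal{J}} \|\bar{x} = x_j\|
       && \text{(by 3.7(b))} \\
    &= \bigvee_{j \in \mathcal{J}} \bigl( \|\bar{x} = \bar{y}\| \wedge \|\bar{x} = x_j\| \bigr)
       && \text{(distributivity in } B\text{)} \\
    &\le \bigvee_{j \in \mathcal{J}} \|\bar{y} = x_j\|
       = \|\bar{h}(\bar{y}) = 0\| . \\[4pt]
  \|\bar{x} = \bar{y}\| \wedge \|\bar{h}(\bar{x}) = 0\| \wedge \|\bar{h}(\bar{y}) = 1\|
    &\le \|\bar{h}(\bar{y}) = 0\| \wedge \|\bar{h}(\bar{y}) = 1\| = 0 .
\end{align*}
```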


Explanation. Since the choice of $\bar{h}$ is the essential step in the proof, we
would like to give some motivation for this choice. Recall that $h$ is the
name of the function the cardinality of whose set of zeros interests us. We
choose a concrete $\bar{h}$ to “disprove” the CH in such a way that the “almost
everywhere zeros” of $\bar{h}$ include the elements of the set $\{x_j \mid j \in \mathcal{J}\}$, which
has intermediate cardinality in the naive sense of the word (compare with
§1). However, $\bar{h}$ cannot be an arbitrary map from $\overline{R}$ to $\overline{R}$; it must satisfy
the strong condition $\bar{h} \in \overline{R}^{(1)}$. Hence, along with all the $x_j$, the almost
everywhere zeros of $\bar{h}$ might also have to include various other $\bar{y} \in \overline{R}$, and
might have to “partly include” still other $\bar{z} \in \overline{R}$. We say “partly include” to
convey the possibility that $\|\bar{h}(\bar{z}) = 0\|$ is neither 0 nor 1, so that $\bar{z}$ has a
“certain probability” of being a zero of $\bar{h}$.

Thus, the “set of zeros” of $\bar{h}$ might be bigger than we want, and we
might expect to encounter difficulties in proving that this set cannot be
mapped onto all of $\overline{R}$ (the alternative $P_1$). On the other hand, it would
seem that this situation would make it trivial to disprove the alternative $P_2$
(mapping Z onto the entire set of zeros). But even this is wrong! As we
noted before, we can have $\|Z(\bar{x})\| = 1$ for many $\bar{x}$ which are not constant
integer functions on $\Omega$. Moreover, for still other $\bar{x}$ we have $\|Z(\bar{x})\| \ne 0, 1$,
so that the “set of integers” in our model has grown considerably.

A final remark: in this discussion we have been essentially dealing with 
the concept of a “B-random set,” which will be a central idea in what 
follows (see §4). That is, the “set of zeros of h” is random in the sense that, 
for each $\bar{z} \in \overline{R}$, the assertion “$\bar{z} \in$ (zeros of $\bar{h}$)” is naturally assigned the
Boolean truth value $\|\bar{h}(\bar{z}) = 0\|$.

We now return to the proof that ||CH|| = 0. 


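3.8. PROOF THAT $\|P_1\|(\xi') = 0$. By the rules for computing truth functions,
we find:

$$\|P_1\|(\xi') = \bigvee_{\bar{g}} \bigwedge_{\bar{y}} \bigvee_{\bar{x}} \bigl(\|\bar{h}(\bar{x}) = 0\| \wedge \|\bar{y} = \bar{g}(\bar{x})\|\bigr),$$

where $\bar{h}$ was defined above, $\bar{g}$ runs through all elements of $\overline{R}^{(1)}$, and $\bar{x}$ and
$\bar{y}$ run through all elements of $\overline{R}$. We suppose that $\|P_1\|(\xi') \ne 0$, and show
that this leads to a contradiction. We write the above formula for $\|P_1\|(\xi')$
as $\bigvee_{\bar{g}} a(\bar{g})$.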

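If $\|P_1\|(\xi') \ne 0$, then $a(\bar{g}) \ne 0$ for some concrete function $\bar{g} \in \overline{R}^{(1)}$. We
take this function $\bar{g}$ and set

$$a = \bigwedge_{\bar{y}} \bigvee_{\bar{x}} \Bigl(\Bigl(\bigvee_{j \in \mathcal{J}} \|\bar{x} = x_j\|\Bigr) \wedge \|\bar{y} = \bar{g}(\bar{x})\|\Bigr).$$

Here we have substituted $\bigvee_{j \in \mathcal{J}} \|\bar{x} = x_j\|$ for $\|\bar{h}(\bar{x}) = 0\|$ using 3.7(b).
Furthermore, we have $\|\bar{x} = x_j\| \wedge \|\bar{y} = \bar{g}(\bar{x})\| \le \|\bar{y} = \bar{g}(x_j)\|$. Using this
and distributivity, we find

$$a \le \bigwedge_{\bar{y}} \bigvee_{j \in \mathcal{J}} \|\bar{y} = \bar{g}(x_j)\|.$$

In particular, for each $x_i$ in place of $\bar{y}$, we have

$$a \le \bigvee_{j \in \mathcal{J}} \|x_i = \bar{g}(x_j)\|.$$

If, as we have supposed, $a \ne 0$, then for each $i$ there exists a $j(i) \in \mathcal{J}$ such
that

$$\|x_i = \bar{g}(x_{j(i)})\| \ne 0.$$

Since $I$ is uncountable and card $\mathcal{J}$ < card $I$, it follows that there exists a
$j_0 \in \mathcal{J}$ such that $j_0 = j(i)$ for all $i$ in an uncountable subset $I_0 \subset I$. But this
contradicts the countable chain condition on $B$, because the terms in the
family $\|x_i = \bar{g}(x_{j_0})\|$ ($i \in I_0$) are pairwise disjoint. In fact,

$$\|x_{i_1} = \bar{g}(x_{j_0})\| \wedge \|x_{i_2} = \bar{g}(x_{j_0})\| \le \|x_{i_1} = x_{i_2}\| = 0 \quad \text{if } i_1 \ne i_2. \qquad \Box$$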

Notice to what extent this proof parallels the “naive” argument in §1.
By assumption, the function $\bar{g}$ maps the zeros of $\bar{h}$ onto $\overline{R}$ “with nonzero
probability.” But the exact meaning of the computations cannot readily be
stated in words.

Computation of $\|Z(y)\|$. The formula for $Z(y)$, “$y$ is an integer,” was
given in 2.3. Since this formula occurs in $P_2$, we must compute $\|Z(y)\|$ in
order to compute $\|P_2\|$.


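3.9. Lemma. Let $\eta \in M$ and $y^\eta = \bar{y} \in \overline{R}$. Then

$$\|Z(y)\|(\eta) = \bigvee_{n \in \mathbf{Z}} \|\bar{y} = n\| = \{\omega \in \Omega \mid \bar{y}(\omega) \in \mathbf{Z}\} \bmod 0.$$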


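PROOF. We must show that

$$\bigwedge_{\bar{f}} \Bigl(\|\bar{f}(0) = 0\|' \vee \Bigl(\bigvee_{\bar{x}} \|\bar{f}(\bar{x}) = \bar{f}(\bar{x} + 1)\|'\Bigr) \vee \|\bar{f}(\bar{y}) = 0\|\Bigr) = \bigvee_{n \in \mathbf{Z}} \|\bar{y} = n\|.$$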


We prove this equality by proving inequality in both directions.
The inequality $\le$. It suffices to find a concrete function $\bar{f} \in \overline{R}^{(1)}$ for
which the corresponding term on the left is contained in the right-hand
side. We define $\bar{f}$ by setting $\bar{f}(\bar{x})(\omega) = \sin^2 \pi\bar{x}(\omega)$ (here, instead of $\sin^2 \pi z$,
we could take any measurable function with period 1 and zeros only at
integers). It is easy to see that $\bar{f}(\bar{x}) \in \overline{R}$ and $\bar{f} \in \overline{R}^{(1)}$. Then $\|\bar{f}(0) = 0\|' = 0$
and $\|\bar{f}(\bar{x}) = \bar{f}(\bar{x} + 1)\|' = 0$. Hence we need only verify that

$$\|\sin^2 \pi\bar{y} = 0\| \le \bigvee_{n \in \mathbf{Z}} \|\bar{y} = n\|,$$

and this is obvious.
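
The two properties of $\sin^2 \pi z$ used here (period 1, zeros exactly at the integers) can be spot-checked numerically. This is an illustrative sketch only: it works with floating-point numbers and a tolerance, so it demonstrates the properties at sample points and proves nothing; the helper `is_zero` and its tolerance are assumptions.

```python
import math


def f(x):
    """sin^2(pi * x): a function with period 1 whose zeros are the integers."""
    return math.sin(math.pi * x) ** 2


def is_zero(value, tol=1e-12):
    """Treat values below a small tolerance as zero (floating-point check)."""
    return abs(value) < tol


# Vanishes at integers, including negative ones.
print([is_zero(f(n)) for n in (-3, 0, 1, 5)])

# Does not vanish at sample non-integer points.
print([is_zero(f(x)) for x in (0.25, 0.5, 2.75)])

# Period 1 at a sample point.
print(is_zero(f(0.3) - f(1.3)))
```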
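The inequality $\ge$. It suffices to show that, for any fixed values of $n \in \mathbf{Z}$,
$\bar{f} \in \overline{R}^{(1)}$ and $\bar{y} \in \overline{R}$, we have:

$$a \le b \vee c,$$

where

$$a = \|\bar{y} = n\|, \qquad b = \|\bar{f}(0) = 0\|' \vee \Bigl(\bigvee_{\bar{x}} \|\bar{f}(\bar{x}) = \bar{f}(\bar{x} + 1)\|'\Bigr), \qquad c = \|\bar{f}(\bar{y}) = 0\|.$$

But the inclusion $a \le b \vee c$ is equivalent to $a \wedge c' \le b$. Furthermore, in our
situation we have

$$a \wedge c' = \|\bar{y} = n\| \wedge \|\bar{f}(\bar{y}) = 0\|' \le \|\bar{f}(n) = 0\|'.$$

(Here $n$ in $\bar{f}(n)$ is the constant random variable which is everywhere equal
to $n$.)
It is thus sufficient to see that

$$\|\bar{f}(n) = 0\|' \le \|\bar{f}(0) = 0\|' \vee \Bigl(\bigvee_{\bar{x}} \|\bar{f}(\bar{x}) = \bar{f}(\bar{x} + 1)\|'\Bigr)$$

or, taking complements, that

$$\|\bar{f}(n) = 0\| \ge \|\bar{f}(0) = 0\| \wedge \bigwedge_{\bar{x}} \|\bar{f}(\bar{x}) = \bar{f}(\bar{x} + 1)\|.$$

The right side can only become larger if we only take the intersection over
the terms with $\bar{x} = 0, 1, 2, \ldots, n - 1$. But this obviously gives

$$\|\bar{f}(0) = 0\| \wedge \|\bar{f}(0) = \bar{f}(1) = \cdots = \bar{f}(n)\| \le \|\bar{f}(n) = 0\|. \qquad \Box$$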


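3.10. PROOF THAT $\|P_2\|(\xi') = 0$. Using Lemma 3.9 and the rules for computing
truth functions, we find:

$$\|P_2\|(\xi') = \bigvee_{\bar{f}} \bigwedge_{\bar{y}} \Bigl(\|\bar{h}(\bar{y}) = 0\|' \vee \bigvee_{\bar{x}} \Bigl(\Bigl(\bigvee_{n \in \mathbf{Z}} \|\bar{x} = n\|\Bigr) \wedge \|\bar{y} = \bar{f}(\bar{x})\|\Bigr)\Bigr).$$

Since $\bar{f} \in \overline{R}^{(1)}$, we have $\|\bar{x} = n\| \le \|\bar{f}(\bar{x}) = \bar{f}(n)\|$, so that
$\|\bar{x} = n\| \wedge \|\bar{y} = \bar{f}(\bar{x})\| \le \|\bar{y} = \bar{f}(n)\|$.

Now it suffices to prove that the term corresponding to any concrete
choice of $\bar{f}$ is equal to 0. We suppose that this is not the case, and show
that we obtain a contradiction. Let $a \ne 0$ be the term corresponding to $\bar{f}$.
By the previous paragraph, we have

$$a \le \bigwedge_{\bar{y}} \Bigl(\|\bar{h}(\bar{y}) = 0\|' \vee \bigvee_{n} \|\bar{y} = \bar{f}(n)\|\Bigr).$$

In particular, for every $j \in \mathcal{J}$ we must have (with $x_j$ in place of $\bar{y}$)

$$a \le \bigvee_{n} \|x_j = \bar{f}(n)\|$$

(where we have $\|\bar{h}(x_j) = 0\|' = 0$ by 3.7(b)). Hence, for every $j$ there exists
an integer $n(j)$ such that $0 \ne \|x_j = \bar{f}(n(j))\|$. Since $\mathcal{J}$ is uncountable, there
exists an $n_0$ and an uncountable subset $\mathcal{J}_0 \subset \mathcal{J}$ such that $n(j_0) = n_0$ for all
$j_0 \in \mathcal{J}_0$. Then the $\|x_j = \bar{f}(n_0)\|$ for $j \in \mathcal{J}_0$ form an uncountable set of
pairwise disjoint nonzero elements of $B$. This contradicts the countable
chain condition on $B$. □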
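
Two steps of this argument are left implicit: the existence of an uncountable fiber of the map $j \mapsto n(j)$, and the disjointness of the resulting family. The following sketch fills them in, writing $\mathcal{J}$ for the chosen uncountable index set, $x_j$ for the coordinate functions, and $\bar{f}$ for the fixed function; the sets $\mathcal{J}_n$ are notation introduced only for this check.

```latex
% Pigeonhole: the index set is a countable union of fibers of j -> n(j).
\[
  \mathcal{J} = \bigcup_{n \in \mathbf{Z}} \mathcal{J}_n ,
  \qquad \mathcal{J}_n = \{\, j \in \mathcal{J} : n(j) = n \,\} .
\]
% A countable union of countable sets is countable, so since \mathcal{J} is
% uncountable, some fiber \mathcal{J}_{n_0} must be uncountable.
%
% Disjointness: for distinct j_1, j_2 in that fiber,
\[
  \|x_{j_1} = \bar{f}(n_0)\| \wedge \|x_{j_2} = \bar{f}(n_0)\|
  \;\le\; \|x_{j_1} = x_{j_2}\| \;=\; 0 ,
\]
% because two distinct coordinate functions agree only on a set of measure zero.
```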


4 Boolean-valued universes


4.1. In this section we fix a complete Boolean algebra $B$ (see 2.6) and
construct the universe $V^B$ of “$B$-random sets.” It will be a model for the
Zermelo–Fraenkel axioms in the same generalized sense in which the
random variables $\overline{R}$ were a model for the real numbers R in §3. In §§5-7
we verify that all the axioms of $L_1\mathrm{Set}$ are “true,” and then in §8 we verify
that the Continuum Hypothesis is “false” for a suitable choice of $B$.

The objects of $V^B$ will be denoted by capital letters $X, Y, Z, \ldots$. Any
two objects determine elements $\|X \in Y\| \in B$ and $\|X = Y\| \in B$. The
intuitive meaning, say, of the first of these is as follows: if $B$ is the algebra
of measurable sets in a probability space, then $\|X \in Y\|$ is the maximal set
on which “$X$ is an element of $Y$ with probability one.” Since we do not
deal with probability measures in the general case, we shall simply call the
elements of $B$ “probabilities,” and then $\|X \in Y\|$ is simply the probability
that $X$ belongs to $Y$.

It is not trivial to construct precise definitions, because we want the 
axiom of extensionality to be “true.” If a random set must be uniquely 
determined by its elements (which are also random), even in a generalized 
sense, then this random set cannot be “too” random (see 4.3). 

We shall assume that as a set $B$ is an element of the von Neumann
universe $V$. Then all the objects of $V^B$ will also be elements of $V$, and all
our constructions can be expressed in $L_1\mathrm{Set}$. In principle, this allows us to
take a more formalistic point of view than we shall in fact take. The proof
given below of the independence of the CH could then be used as a guide
for constructing a much more syntactic version, based on an “internal
interpretation” of the language $L_1\mathrm{Set}$ in itself. In this context the assumption
that the Zermelo–Fraenkel axioms are consistent in the statement of
Theorem 1.6 becomes a necessary precaution, since (by Gödel’s result) this
consistency cannot be established using only the language $L_1\mathrm{Set}$ itself.
However, in our treatment this condition is pure hypocrisy, since by 
assuming the “existence” of the universe V, which is a model for the 
axioms, we automatically “prove” that those axioms are consistent (see 
subsection 18 of the Appendix to Chapter II). 


4.2. Construction of $V^B$. For every ordinal $\alpha$ we construct the set 
$V_\alpha^B$ by transfinite recursion, and then set 
$V^B = \bigcup_\alpha V_\alpha^B$. The first step is: $V_0^B = \emptyset$. 

Inductive assumption. The set $V_\alpha^B$ is defined for the ordinal 
$\alpha \ge 0$; for every element $X \in V_\alpha^B$ the set 
$D(X) \subset V_\alpha^B$ is defined (its intuitive meaning will be explained 
below); for every pair of elements $X, Y \in V_\alpha^B$ the 
“Boolean truth functions” 

\[ \|X \in Y\| \in B, \qquad \|X = Y\| \in B \]

are defined (intuitively, they should be thought of as the “probability that 
X is an element of Y” and the “probability that X coincides with Y,” 
respectively). 

By assumption, this data satisfies the following conditions: 


(a) If $\beta_1 \le \beta_2 \le \alpha$, then $V_{\beta_1}^B \subset V_{\beta_2}^B$. 
(b) If $\beta < \alpha$ and $X \in V_{\beta+1}^B \setminus V_\beta^B$, then $D(X) = V_\beta^B$. 
(c) 
\[ \|X \in Y\| = \bigvee_{Z \in D(Y)} \bigl(\|X = Z\| \wedge \|Z \in Y\|\bigr) \tag*{$(1)_\alpha$} \]
(the condition $(1)_\alpha$ expresses the requirement that the formula 
$x \in y \Leftrightarrow \exists z(x = z \wedge z \in y)$, which is easily deduced 
from the Zermelo-Fraenkel axioms, must be “true”). 
(d) 
\[ \|X = Y\| = \Bigl(\bigwedge_{Z \in D(X)} \|Z \in X\|' \vee \|Z \in Y\|\Bigr) \wedge \Bigl(\bigwedge_{Z \in D(Y)} \|Z \in Y\|' \vee \|Z \in X\|\Bigr) \tag*{$(2)_\alpha$} \]
(this condition expresses the “truth” of the formula 
$x = y \Leftrightarrow (\forall z(z \in x \Rightarrow z \in y) \wedge \forall z(z \in y \Rightarrow z \in x))$). 
We note that it is not completely clear at this 
point why, for example, in $(1)_\alpha$ we only took the union over Z in D(Y); it 
would seem natural to take all Z. Later we shall see that the formula 
remains true if we take the Boolean union over all Z. 

This completes the description of the data for $V_\alpha^B$. We now give 
explicitly the recursive construction of $V_{\alpha+1}^B$ and the corresponding data. 


Definition of $V_{\alpha+1}^B$ and D. We set 
$V_{\alpha+1}^B = V_\alpha^B \cup V_{\alpha+1}^{B\prime}$, where 
$V_{\alpha+1}^{B\prime}$ consists of all possible functions Z with domain of 
definition $V_\alpha^B$ and range of values $\subset B$ which satisfy the 
following “extensionality condition”: 

\[ \|X = Y\| \wedge Z(X) = \|X = Y\| \wedge Z(Y), \quad \text{for all } X, Y \in V_\alpha^B. \tag{3} \]

A little later we shall define $\|X \in Z\| = Z(X)$ for $X \in V_\alpha^B$ and 
$Z \in V_{\alpha+1}^B \setminus V_\alpha^B$. Thus, as before, (3) can be thought 
of as reflecting the formula 

\[ (x = y \wedge x \in z) \Leftrightarrow (x = y \wedge y \in z). \]


Compare also with the comment in 2.7 concerning the definition of R“”. 

We shall call the elements of $V_{\alpha+1}^B \setminus V_\alpha^B$ new elements 
(of rank $\alpha + 1$), and we shall call the elements of $V_\alpha^B$ old 
elements. We set $D(Z) = V_\alpha^B$ if Z is a new element. 


Definition of the Boolean truth functions. These functions have already 
been defined for pairs of old elements. We further set: 


\[ \|X \in Y\| = Y(X), \quad \text{if X is old and Y is new;} \tag{4} \]
\[ \|X = Y\| = \Bigl(\bigwedge_{Z \in D(X)} \|Z \in X\|' \vee \|Z \in Y\|\Bigr) \wedge \Bigl(\bigwedge_{Z \in D(Y)} \|Z \in Y\|' \vee \|Z \in X\|\Bigr). \tag{5} \]

Because of $(2)_\alpha$, (5) automatically holds if X and Y are both old 
elements; in the other cases, (5) uniquely determines $\|X = Y\|$ if we use 
(4) and the fact that Z only runs through old elements in (5). Finally, we 
set: 

\[ \|X \in Y\| = \bigvee_{Z \in D(Y)} \|X = Z\| \wedge \|Z \in Y\| \tag{6} \]

if X is a new element and Y is either new or old. The right side is 
uniquely determined using (4) and (5), since $D(Y) \subset V_\alpha^B$. 


Formulas (4) and (6) show the following. As a first approximation we 
might say that a random set Y of rank $\alpha$ “consists” of sets Z of lower rank 
which occur in Y with probability Y(Z); these probabilities can be chosen 
rather arbitrarily, subject only to the extensionality condition (3). 

However, we then find (in formula (6) for new X and old Y) that we 
must automatically “include” more and more elements X in Y with 
probabilities already assigned by formula (6). It is conditions (3) and (6) 
which prevent our sets from being completely random. 


Definition of $V_\alpha^B$ and other data for limiting ordinals $\alpha$. We simply 
set $V_\alpha^B = \bigcup_{\beta < \alpha} V_\beta^B$, and then all the other 
data has already been determined. 


4.3. Verification that the definitions are correct. Properties 4.2(a) and (b) are 
obviously preserved in going from $\alpha$ to $\alpha + 1$; we must verify 
$(1)_{\alpha+1}$ and $(2)_{\alpha+1}$. Now the only identity here which is not 
completely obvious is obtained by taking X old and Y new in $(1)_{\alpha+1}$: 

\[ Y(X) = \bigvee_{Z \in V_\alpha^B} \|X = Z\| \wedge Y(Z). \]

This is verified as follows. We obtain $\ge$ by writing the right-hand side in 
the form $\bigvee_Z \|X = Z\| \wedge Y(X)$ using (3). We obtain $\le$ by 
considering the term with Z = X and taking into account that $\|X = X\| = 1$ 
for all X (as follows immediately from (5)). 
This completes the construction of the Boolean-valued universe. 


4.4. EXAMPLES AND REMARKS. We examine some special cases of these 
constructions in order to clarify their structure. 

(a) Obviously $V_1^B = \{\emptyset\}$, since there exists a unique “empty” function, 
whose domain of definition is the subset $V_0^B = \emptyset$. We compute 
$V_2^B = V_1^B \cup V_2^{B\prime}$. We let $\{\emptyset\}_b \in V_2^{B\prime}$ denote 
the function of the one-element set $V_1^B$ which takes the value $b \in B$. All 
these functions are extensional, so that 

\[ V_2^B = \{\emptyset,\ \{\emptyset\}_b \text{ for all } b \in B\}. \]

It follows from (4) that 

\[ \|\emptyset \in \{\emptyset\}_b\| = b. \]

It is clear from (5) that 

\[ \|\emptyset = \{\emptyset\}_b\| = b'. \]

Intuitively, these formulas mean that $\{\emptyset\}_b$ consists of one element 
$\emptyset$ “over b” and is empty away from b. Again applying (5), we find: 

\[ \|\{\emptyset\}_a = \{\emptyset\}_b\| = (a' \vee b) \wedge (b' \vee a) = (a \wedge b) \vee (a' \wedge b'). \]

Thus, $\{\emptyset\}_a$ and $\{\emptyset\}_b$ coincide when either they are both 
empty or they both consist of one element $\emptyset$: this agrees with intuition. 
Now applying (6), we find: 

\[ \|\{\emptyset\}_a \in \{\emptyset\}_b\| = \|\{\emptyset\}_a = \emptyset\| \wedge \|\emptyset \in \{\emptyset\}_b\| = a' \wedge b \]

(i.e., the only possible inclusion, which has the form $\emptyset \in \{\emptyset\}$, 
holds when $\{\emptyset\}_a$ is empty and $\{\emptyset\}_b$ is nonempty). 

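Finally, let $X \in V_3^{B\prime}$ be an extensional function on the subset 
$V_2^B$ with values in B. Then, by (6), 

\[ \|X \in \{\emptyset\}_b\| = \|X = \emptyset\| \wedge \|\emptyset \in \{\emptyset\}_b\| = \|X = \emptyset\| \wedge b, \]

and by (5) 

\[ \|X = \emptyset\| = \Bigl(\bigwedge_a \|\{\emptyset\}_a \in X\|'\Bigr) \wedge \|\emptyset \in X\|' = \Bigl(\bigvee_a \|\{\emptyset\}_a \in X\| \vee \|\emptyset \in X\|\Bigr)'. \]

Thus, intuitively, $\|X = \emptyset\|$ means the complement of the support of X in 
B, and $\|X \in \{\emptyset\}_b\|$ is the set where both X is empty and 
$\{\emptyset\}_b$ is nonempty, which again agrees with the usual formula 
$\emptyset \in \{\emptyset\}$. This shows how new objects X can be random 
elements of old objects with nonzero probabilities. 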
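
The values computed in (a) can be checked mechanically. The following is a minimal sketch in Python, not part of the original text, under stated assumptions: B is taken to be the four-element algebra of subsets of a two-point space, encoded as the bitmasks 0 to 3; an element of $V^B$ is encoded as a tuple of (element, value) pairs listing the function on its whole domain D; the names `comp`, `single`, `mem`, `eq`, and `check_level_two` are illustrative only. Both truth functions are computed from the Boolean sum in $(1)_\alpha$ and the Boolean product in $(2)_\alpha$, which by 4.3 agree with (4)-(6) on extensional functions.

```python
from functools import lru_cache

TOP = 0b11  # unit of B; the elements of B are the bitmasks 0..3 (assumption)

def comp(a):
    """Boolean complement a' in B."""
    return TOP ^ a

EMPTY = ()  # the empty function, i.e. the single element of V_1^B

def single(b):
    """{∅}_b: the function on V_1^B = {∅} that takes the value b."""
    return ((EMPTY, b),)

@lru_cache(maxsize=None)
def mem(x, y):
    """||x ∈ y|| as the Boolean sum over z in D(y) of ||x = z|| ∧ y(z)."""
    out = 0
    for z, yz in y:
        out |= eq(x, z) & yz
    return out

@lru_cache(maxsize=None)
def eq(x, y):
    """||x = y|| as the Boolean product in (2) and (5)."""
    out = TOP
    for z, xz in x:
        out &= comp(xz) | mem(z, y)
    for z, yz in y:
        out &= comp(yz) | mem(z, x)
    return out

def check_level_two():
    """Compare the computed values with the four formulas derived in (a)."""
    for a in range(TOP + 1):
        for b in range(TOP + 1):
            ok = (
                mem(EMPTY, single(b)) == b
                and eq(EMPTY, single(b)) == comp(b)
                and eq(single(a), single(b)) == (a & b) | (comp(a) & comp(b))
                and mem(single(a), single(b)) == comp(a) & b
            )
            if not ok:
                return False
    return True
```

Calling `check_level_two()` runs through all sixteen pairs (a, b). Bitmasks were chosen only because they make the Boolean operations one-liners; any finite Boolean algebra would serve.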

(b) We consider the case B = {0,1}. The corresponding probability 
space consists of one point, so our random sets become completely 
determined. What happens is: the universe $V^B$ maps naturally onto the 
von Neumann universe V in such a way that, if $\bar X$ denotes the image of 
$X \in V^B$, then all X and Y satisfy the conditions: 

\[ \|X \in Y\| = 1 \Leftrightarrow \bar X \in \bar Y, \]
\[ \|X = Y\| = 1 \Leftrightarrow \bar X = \bar Y. \]

To construct this map we first set $\bar\emptyset = \emptyset$. We now suppose 
that the map $V_\alpha^{\{0,1\}} \to V_\alpha$ has already been constructed with 
the required properties, and we extend the map to $\alpha + 1$. To do this, for 
any new element $X \in V_{\alpha+1}^{\{0,1\}}$ we first find the subset of 
$V_\alpha^{\{0,1\}}$ on which X takes the value 1, and we then take the image 
of this subset in $V_\alpha$, which is an element $\bar X$ of 
$\mathcal{P}(V_\alpha) = V_{\alpha+1}$; by definition, our map takes X to this 
$\bar X$. We leave the verification of the properties of this map to the reader. 
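
The map described in (b) can be tried out on the first few levels. The sketch below is an illustration under assumptions, not the author's construction verbatim: elements of $V^{\{0,1\}}$ are encoded as tuples of (element, value) pairs, ordinary sets as Python `frozenset`s, and the helper names `next_level`, `build`, `collapse`, and `agrees_with_collapse` are hypothetical. It builds the levels by enumerating extensional functions and compares the truth functions with membership and equality of the images.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def mem(x, y):
    """||x ∈ y|| in {0, 1}: some z in D(y) has y(z) = 1 and ||x = z|| = 1."""
    return max((eq(x, z) & yz for z, yz in y), default=0)

@lru_cache(maxsize=None)
def eq(x, y):
    """||x = y|| in {0, 1}: each side's members with value 1 belong to the other."""
    forward = all(xz == 0 or mem(z, y) == 1 for z, xz in x)
    backward = all(yz == 0 or mem(z, x) == 1 for z, yz in y)
    return int(forward and backward)

def next_level(level):
    """Old elements plus every extensional {0,1}-valued function on them."""
    new = []
    for values in product((0, 1), repeat=len(level)):
        f = tuple(zip(level, values))
        if all((eq(x, y) & fx) == (eq(x, y) & fy) for x, fx in f for y, fy in f):
            new.append(f)
    return list(level) + new

def build(n):
    """The level with index n, starting from the empty level with index 0."""
    level = []
    for _ in range(n):
        level = next_level(level)
    return level

def collapse(x):
    """Image of x in the von Neumann universe: images of the z with x(z) = 1."""
    return frozenset(collapse(z) for z, xz in x if xz == 1)

def agrees_with_collapse(universe):
    """Check the two equivalences stated in (b) on the given elements."""
    return all(
        (mem(x, y) == 1) == (collapse(x) in collapse(y))
        and (eq(x, y) == 1) == (collapse(x) == collapse(y))
        for x in universe
        for y in universe
    )
```

For example, `agrees_with_collapse(build(3))` checks every pair of the third level. Several elements of $V^{\{0,1\}}$ may have the same image, which is why the map is onto rather than one-to-one.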

(c) Boolean truth functions for the formulas in $L_1$Set. 

We define these truth functions in an analogous manner to §2. We 
introduce the interpretation class M: each point $\xi \in M$ assigns to every 
variable symbol x in $L_1$Set some object $x^\xi = X$ of the universe $V^B$. We 
further assume that every point $\xi$ maps the symbol $\emptyset$ in $L_1$Set to 
the empty set. 

If P is the atomic formula $x \in y$ or $x = y$ in $L_1$Set, then $\|P\|(\xi)$ is 
defined to be $\|x^\xi \in y^\xi\| \in B$ or $\|x^\xi = y^\xi\| \in B$, respectively. 
The value of $\|P\|(\xi)$ for all other P is defined inductively using exactly the 
same formulas as in subsection 2.7. We need only note that, although the 
expressions $\bigvee_i a_i$ and $\bigwedge_i a_i$ must be taken over families 
indexed by the class M when we compute with quantifiers, all the different 
elements of such a family form a subset of B, so that such an expression makes 
sense. We shall call a formula P “true” (in the model $V^B$) if $\|P\|(\xi) = 1$ 
for all $\xi$, and we shall call P “false” if $\|P\|(\xi) = 0$ for all $\xi$. 

As in §3 of Chapter II, it can be verified that all the tautologies and 
logical quantifier axioms are “true” and that the rules of deduction 
preserve “truth.” Hence, it remains for us to show that the 
Zermelo-Fraenkel axioms are “true” (for any B) and that the Continuum 
Hypothesis is “false” (for suitable B). 


5 The axiom of extensionality is “true” 


We begin by proving some relations between the truth functions. First of 
all, it is clear from formula (5) in §4 that ||X = Y||=||Y=X|| and 
||X = X || = 1. The following lemma is a less immediate consequence of the 
formulas. 


5.1. Lemma. For any $X, Y, Z \in V^B$ we have: 

\[ \|X = Y\| \wedge \|Y = Z\| \le \|X = Z\|, \tag{I} \]
\[ \|X = Y\| \wedge \|Y \in Z\| \le \|X \in Z\|, \tag{II} \]
\[ \|X \in Y\| \wedge \|Y = Z\| \le \|X \in Z\|. \tag{III} \]
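
Before the proof, the three inequalities can be sanity-checked on a small instance. The sketch below is only an illustration under assumptions: B is the four-element algebra encoded as bitmasks 0 to 3, elements of $V^B$ are tuples of (element, value) pairs, and the names `next_level`, `build`, and `lemma_5_1_holds` are hypothetical. It enumerates the extensional functions up to the third level and tests (I), (II), and (III) on every triple; a finite check of this kind is of course no substitute for the induction that follows.

```python
from functools import lru_cache
from itertools import product

TOP = 0b11  # unit of the four-element Boolean algebra (assumption)

def comp(a):
    """Boolean complement in B."""
    return TOP ^ a

@lru_cache(maxsize=None)
def mem(x, y):
    """||x ∈ y||: Boolean sum over z in D(y) of ||x = z|| ∧ y(z)."""
    out = 0
    for z, yz in y:
        out |= eq(x, z) & yz
    return out

@lru_cache(maxsize=None)
def eq(x, y):
    """||x = y||: Boolean product over D(x) and D(y), as in formula (5) of §4."""
    out = TOP
    for z, xz in x:
        out &= comp(xz) | mem(z, y)
    for z, yz in y:
        out &= comp(yz) | mem(z, x)
    return out

def next_level(level):
    """Old elements plus every extensional B-valued function on them."""
    new = []
    for values in product(range(TOP + 1), repeat=len(level)):
        f = tuple(zip(level, values))
        if all((eq(x, y) & fx) == (eq(x, y) & fy) for x, fx in f for y, fy in f):
            new.append(f)
    return list(level) + new

def build(n):
    """The level with index n, starting from the empty level with index 0."""
    level = []
    for _ in range(n):
        level = next_level(level)
    return level

def lemma_5_1_holds(universe):
    """p <= q in B is tested as p ∧ q' = 0, for each of (I), (II), (III)."""
    for x in universe:
        for y in universe:
            for z in universe:
                if eq(x, y) & eq(y, z) & comp(eq(x, z)):    # (I)
                    return False
                if eq(x, y) & mem(y, z) & comp(mem(x, z)):  # (II)
                    return False
                if mem(x, y) & eq(y, z) & comp(mem(x, z)):  # (III)
                    return False
    return True
```

A call such as `lemma_5_1_holds(build(3))` runs over all triples of the third level.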


PROOF. 
(a) (III) holds if $X \in D(Y)$. In fact, then by formula (5) in §4 

\[ \|Y = Z\| \le \|X \in Y\|' \vee \|X \in Z\|, \]

so that, if we intersect both sides with $\|X \in Y\|$, we obtain (III). 


(b) (III) holds if $X, Y \in V_\alpha^B$ and Z is a new element of 
$V_{\alpha+1}^B$. In fact, we choose $U \in D(Y)$ and apply the special case of 
(III) proved in (a): 

\[ \|U \in Y\| \wedge \|Y = Z\| \le \|U \in Z\|. \]

We take the Boolean intersection of both sides with $\|X = U\|$ and then the 
Boolean sum over all $U \in D(Y)$. Now applying formula (6) in §4 to the 
left-hand side and using distributivity, we obtain: 

\[ \|X \in Y\| \wedge \|Y = Z\| \le \bigvee_{U \in D(Y)} \|X = U\| \wedge \|U \in Z\| \le \bigvee_{U \in D(Z) = V_\alpha^B} \|X = U\| \wedge \|U \in Z\| = \|X \in Z\|. \]

(c) (I) holds in $V_{\alpha+1}^B$ if (III) holds in $V_\alpha^B$. We consider an 
element $U \in D(X) \subset V_\alpha^B$. By (a), we have 

\[ \|U \in X\| \wedge \|X = Y\| \le \|U \in Y\|. \]

We take the Boolean intersection with $\|Y = Z\|$: 

\[ \|U \in X\| \wedge \|X = Y\| \wedge \|Y = Z\| \le \|U \in Y\| \wedge \|Y = Z\|. \]

Here the right side is always $\le \|U \in Z\|$. In fact, if $Y \in V_\alpha^B$ 
this follows by part (b) or by the induction assumption, and if Y is a new 
element of $V_{\alpha+1}^B$ then it follows by part (a). 

We have thus shown that for all $X, Y, Z \in V_{\alpha+1}^B$ and all $U \in D(X)$: 

\[ \|U \in X\| \wedge \|X = Y\| \wedge \|Y = Z\| \le \|U \in Z\|. \]

Because $a \wedge b \le c$ implies $b \le a' \vee c$ in any Boolean algebra, we 
then obtain 

\[ \|X = Y\| \wedge \|Y = Z\| \le \|U \in X\|' \vee \|U \in Z\|, \]

and hence 

\[ \|X = Y\| \wedge \|Y = Z\| \le \bigwedge_{U \in D(X)} \|U \in X\|' \vee \|U \in Z\|. \]

Interchanging X and Z, we find that 

\[ \|Z = Y\| \wedge \|Y = X\| \le \bigwedge_{U \in D(Z)} \|U \in Z\|' \vee \|U \in X\|. \]

These last two formulas, together with (5), clearly imply (I). 
(d) (II) holds in $V_{\alpha+1}^B$ if (I) holds in $V_{\alpha+1}^B$. In fact, let 
$U \in D(Z)$. By (I), we have 

\[ \|X = Y\| \wedge \|Y = U\| \le \|X = U\|. \]

We take the Boolean intersection with $\|U \in Z\|$ and then the Boolean sum 
over all $U \in D(Z)$: 

\[ \|X = Y\| \wedge \Bigl(\bigvee_{U \in D(Z)} \|U \in Z\| \wedge \|Y = U\|\Bigr) \le \bigvee_{U \in D(Z)} \|X = U\| \wedge \|U \in Z\|. \]

Applying $(1)_{\alpha+1}$ in §4, we obtain (II). 
(e) (III) holds in $V_{\alpha+1}^B$ if (II) holds in $V_{\alpha+1}^B$. In fact, let 
$U \in D(Y)$. By part (a), we have 

\[ \|U \in Y\| \wedge \|Y = Z\| \le \|U \in Z\|. \]

Intersecting with $\|X = U\|$ and applying (II) to the right-hand side, we 
obtain 

\[ \|X = U\| \wedge \|U \in Y\| \wedge \|Y = Z\| \le \|X \in Z\|. \]

Finally, if we take the Boolean sum over all $U \in D(Y)$ and use formula 
(1) in §4, we obtain (III). □ 


Obviously, parts (a)-(e) prove the inductive step from $\alpha$ to $\alpha + 1$. 
We are now in a position to establish the basic result of this section. 


5.2. Proposition. The axiom of extensionality 

\[ x = y \Leftrightarrow \forall z(z \in x \Leftrightarrow z \in y) \]

is “true.” 


PROOF. The formula $\|P \Leftrightarrow Q\|(\xi) = 1$ is equivalent to 
$\|P\|(\xi) = \|Q\|(\xi)$. It is therefore sufficient to prove that for all 
$X, Y \in V^B$ 

\[ \|X = Y\| = \bigwedge_{Z} \bigl(\|Z \in X\|' \vee \|Z \in Y\|\bigr) \wedge \bigl(\|Z \in X\| \vee \|Z \in Y\|'\bigr). \]

The inequality $\ge$ follows immediately from formula (2) in §4. To obtain 
the opposite inequality, we write two obvious corollaries of formula (III) in 
Lemma 5.1: 

\[ \|X = Y\| \le \|Z \in X\|' \vee \|Z \in Y\|, \]
\[ \|X = Y\| \le \|Z \in X\| \vee \|Z \in Y\|', \]

and we take the intersection over all Z. The proposition is proved. □ 


We note that formula (II) implies the following general extensionality 
property: for all $X, Y, Z \in V^B$ 

\[ \|X = Y\| \wedge \|Y \in Z\| = \|X = Y\| \wedge \|X \in Z\|. \]


5.3. Corollary. The axioms of equality in $L_1$Set are “true.” 


In fact (see Proposition 4.6 in Chapter II), the axioms of equality in our 
case consist of: the “true” formula $x = x$, the axiom of extensionality (in 
the form $x = y \Rightarrow (P(x) \Rightarrow P(y))$ with $P(x) = z \in x$), and 
the “true” formula $x = y \Rightarrow (x \in z \Rightarrow y \in z)$ (in which 
$P(x) = x \in z$), since the only atomic formulas P(x) in $L_1$Set are 
$z \in x$ and $x \in z$. □ 


5.4. Remark. In most computations, we shall only need to know the values 
of $\|X \in Y\|$ and $\|X = Y\|$, and not the precise definition of the objects X 
and Y. In this connection, we note that the following two binary relations 
on $V^B$ coincide (as easily follows from (III) and the axiom of 
extensionality): 

(a) $\|X = Y\| = 1$, 
(b) $\forall Z \in V^B$, $\|Z \in X\| = \|Z \in Y\|$. 

We shall call such X and Y equivalent and write X ~ Y. 


6 The axioms of pairing, union, power set, 
and regularity are “true” 


6.1. The computations in the previous section show that the basic work in 
ensuring that the axiom of extensionality is “true” was already 
incorporated into the definition of the universe $V^B$. The explicit formulas for 
recursively computing $\|X \in Y\|$ and $\|X = Y\|$ reflected so many special 
properties of inclusion and equality that together they guaranteed that the 
general axiom must hold. 

In order to verify several of the other axioms, we must essentially define 
in $V^B$ analogues of certain operations in V, such as forming the unordered 
pair, the set of subsets, and so on. These operations can be defined by 
means of formulas in $L_1$Set. However, recall that, if P(x) is a formula with 
one free variable x, then the $x^\xi \in V$ for which $P(x)(\xi)$ is true generally 
form a class and not a set. 

It will be convenient to introduce the auxiliary notion of a “random 
class” in $V^B$. Using this concept, we shall often construct the operations in 
$V^B$ in two stages: the value of the operation will at first be a random class, 
which we then “identify” with a random set using a separate argument. 


6.2. Definition. 
(a) A random class is any function W on $V^B$ with values in B which 
satisfies the following extensionality condition: 

\[ W(X) \wedge \|X = Y\| = W(Y) \wedge \|X = Y\|, \quad \text{for all } X, Y \in V^B. \]

(b) A random class W is said to be equivalent to a random set 
$Z \in V^B$ (written W ~ Z) if 

\[ W(X) = \|X \in Z\|, \quad \text{for all } X \in V^B. \]


6.3. EXAMPLES AND REMARKS 

(a) For any random set Z the function $X \mapsto \|X \in Z\|$ is extensional by 
(II), §5, and so is a random class. By analogy, we often write $\|X \in W\|$ 
instead of W(X) if W is any random class. 

(b) There exist random classes which are not equivalent to random sets. 
One such example is the “universal” random class W(X) = 1 for all X. (If 
W were a set, we would have ||W € W|| = 1, contradicting the regularity 
axiom, which will be shown to be “true” below.) 

(c) Let W be a random class, and let $\alpha$ be any ordinal. We define the 
element $W_\alpha \in V_{\alpha+1}^B$ as follows: 

$D(W_\alpha) = V_\alpha^B$, $W_\alpha$ = the restriction of W to $V_\alpha^B$ (as a function; see 4.2). 

It is easy to see that for all $X \in V^B$ we have: 

\[ \|X \in W_\alpha\| \le \|X \in W\|. \tag{1} \]

In fact, let $U \in V_\alpha^B$ and $X \in V^B$. We then have 

\[ \|X = U\| \wedge W_\alpha(U) = \|X = U\| \wedge W(U) = \|X = U\| \wedge W(X) \le W(X), \]

so that, by (6), §4: 

\[ \|X \in W_\alpha\| = \bigvee_{U \in V_\alpha^B} \|X = U\| \wedge W_\alpha(U) \le W(X) = \|X \in W\|. \]

We shall often show that some class W which we are interested in is 
equivalent to a set by finding an ordinal $\alpha$ such that $W \sim W_\alpha$. It is 
clear from (1) that this follows if $\|X \in W\| \le \|X \in W_\alpha\|$ for all X. 

(d) Let W, $W_1$, and $W_2$ be random classes. Then $W'$, $W_1 \wedge W_2$, and 
$W_1 \vee W_2$ are also random classes, since the extensionality condition is 
trivially verified for these functions. We shall write $W_1 \cap W_2$ and 
$W_1 \cup W_2$ instead of $W_1 \wedge W_2$ and $W_1 \vee W_2$, respectively. 

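(e) Let W be a random class, and let X be a random set. We show that 
$W \cap X$ is equivalent to a random set. More precisely, if $D(X) = V_\alpha^B$, 
then $W \cap X \sim (W \cap X)_\alpha$. In fact, for any $Y \in V^B$ it follows by 
(6), §4 that: 

\[ \|Y \in (W \cap X)_\alpha\| = \bigvee_{U \in V_\alpha^B} \|U = Y\| \wedge \|U \in W \cap X\| \]
\[ = \bigvee_{U \in V_\alpha^B} \|U = Y\| \wedge \|U \in W\| \wedge \|U \in X\| \]
\[ = \bigvee_{U \in V_\alpha^B} \|U = Y\| \wedge \|Y \in W\| \wedge \|U \in X\| \]
\[ = \|Y \in W\| \wedge \|Y \in X\| = \|Y \in W \cap X\|. \]

This result implies that the separation axioms are “true” (see subsection 
4.9(b) of Chapter II). 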


The following proposition gives a general method for constructing 
random classes. 


6.4. Proposition. Let $P(x, y_1, \ldots, y_n)$ be a formula which does not contain 
any free variables besides $x, y_1, \ldots, y_n$. Let $Y_1, \ldots, Y_n \in V^B$ be 
fixed. Then the function 

\[ X \mapsto W(X) = \|P(X, Y_1, \ldots, Y_n)\| \]

is a random class. 


Intuitively, W contains every set X with probability equal to the 
probability that $P(X, Y_1, \ldots, Y_n)$ is true. $Y_1, \ldots, Y_n$ play the role 
of “constants.” 


PROOF. We use the “truth” of the following axiom of equality: 

\[ \|\forall x\, \forall y\, \forall y_1 \cdots \forall y_n\, (x = y \Rightarrow (P(x, y_1, \ldots, y_n) \Rightarrow P(y, y_1, \ldots, y_n)))\| = 1. \]

If we take a point $\xi$ in the interpretation class which assigns to 
$x, y, y_1, \ldots, y_n$ the values $X, Y, Y_1, \ldots, Y_n$, respectively, then we 
find that 

\[ \|X = Y\| \le \|P(X, Y_1, \ldots, Y_n)\|' \vee \|P(Y, Y_1, \ldots, Y_n)\| \]

or 

\[ \|X = Y\| \wedge W(X) \le W(Y), \]

so that W is extensional. □ 
We are now ready to verify the axioms. 


6.5. Proposition. The axiom of pairing 

\[ \forall u\, \forall w\, \exists x\, \forall z(z \in x \Leftrightarrow z = u \vee z = w) \]

is “true.” 


PROOF. By definition we have 

\[ \|\forall u\, \forall w\, \exists x\, \forall z(z \in x \Leftrightarrow z = u \vee z = w)\| = \bigwedge_U \bigwedge_W \bigvee_X \bigwedge_Z \|Z \in X \Leftrightarrow Z = U \vee Z = W\|. \]

Hence it suffices if, for any $U, W \in V^B$, we find an $X \in V^B$ such that 
for all $Z \in V^B$ 

\[ \|Z \in X\| = \|Z = U\| \vee \|Z = W\|. \tag{2} \]

For fixed U and W we consider the right side of (2) as a function of Z. 
This function is a random class X by Proposition 6.4, since it corresponds 
to the formula $z = U \vee z = W$. We show that it is equivalent to a random 
set; more precisely, if $U, W \in V_\alpha^B$, then $X \sim X_\alpha$. By the remark 
at the end of 6.3(c), it suffices to verify that for all Z 

\[ \|Z \in X\| \le \|Z \in X_\alpha\|. \]

But since $\|U \in X_\alpha\| = 1$, it follows by formula (II) in §5 that 

\[ \|Z = U\| \le \|Z \in X_\alpha\|, \]

and similarly 

\[ \|Z = W\| \le \|Z \in X_\alpha\|, \]

which gives the required inequality. □ 


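6.6. Proposition. The axiom of union 

\[ \forall x\, \exists y\, \forall u(\exists z(u \in z \wedge z \in x) \Leftrightarrow u \in y) \]

is “true.” 


PROOF. We fix $X \in V^B$ and construct a random set Y such that for all 
$U \in V^B$ 

\[ \|U \in Y\| = \|\exists z(U \in z \wedge z \in X)\| = \bigvee_{Z \in V^B} \|U \in Z\| \wedge \|Z \in X\|. \]

By Proposition 6.4, there exists a random class Y with this property. We 
show that if $D(X) = V_\alpha^B$, then $Y \sim Y_\alpha$. Since 
$D(Y_\alpha) = D(X)$, we have 

\[ \|U \in Y_\alpha\| = \bigvee_{Z \in D(X)} \|U = Z\| \wedge \|Z \in Y_\alpha\| = \bigvee_{Z \in D(X)} \|U = Z\| \wedge \Bigl(\bigvee_{Z_1 \in V^B} \|Z \in Z_1\| \wedge \|Z_1 \in X\|\Bigr). \tag{3} \]

We show that the inner sum in (3) may be taken only over $Z_1 \in D(X)$. In 
fact, for any $Z_1$ 

\[ \|Z_1 \in X\| = \bigvee_{Z_2 \in D(X)} \|Z_1 = Z_2\| \wedge \|Z_2 \in X\|, \]

so that 

\[ \|Z \in Z_1\| \wedge \|Z_1 \in X\| = \bigvee_{Z_2 \in D(X)} \|Z \in Z_1\| \wedge \|Z_1 = Z_2\| \wedge \|Z_2 \in X\| \le \bigvee_{Z_2 \in D(X)} \|Z \in Z_2\| \wedge \|Z_2 \in X\|. \tag{4} \]

Taking this into account, in (3) we first sum over Z for fixed $Z_1 \in D(X)$. 
Since $D(Z_1) \subset D(X)$, the sum over $Z \in D(X)$ coincides with the sum 
over $Z \in D(Z_1)$, and is equal to $\|U \in Z_1\|$. Thus, 

\[ \|U \in Y_\alpha\| = \bigvee_{Z_1 \in D(X)} \|U \in Z_1\| \wedge \|Z_1 \in X\| = \bigvee_{Z_1 \in V^B} \|U \in Z_1\| \wedge \|Z_1 \in X\| = \|U \in Y\|, \]

by (4). □ 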


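6.7. Proposition. The power set axiom 

\[ \forall x\, \exists y\, \forall z(z \subset x \Leftrightarrow z \in y) \]

is “true.” (Recall that $z \subset x$ is abbreviated notation for 
$\forall u(u \in z \Rightarrow u \in x)$.) 

PROOF. We fix $X \in V^B$ and construct a $Y \in V^B$ such that for all 
$Z \in V^B$ 

\[ \|Z \in Y\| = \|Z \subset X\| = \bigwedge_{U \in V^B} \|U \in Z\|' \vee \|U \in X\|. \]

By Proposition 6.4, the right side defines Y as a random class. We show 
that, if $D(X) = V_\alpha^B$, then $Y \sim Y_{\alpha+1}$. 

We first construct the element $Z_\alpha \in V_{\alpha+1}^B$ by considering Z as a 
random class. By (1) we have $\|U \in Z_\alpha\|' \ge \|U \in Z\|'$, so that 

\[ \|Z \in Y\| \le \|Z_\alpha \in Y\| = \|Z_\alpha \in Y_{\alpha+1}\|. \tag{5} \]

If we prove the inequality 

\[ \|Z \in Y\| \le \|Z_\alpha = Z\|, \tag{6} \]

it will immediately follow from (5) and (6) that $Y \sim Y_{\alpha+1}$, since, by 
(II), §5, 

\[ \|Z \in Y\| \le \|Z_\alpha \in Y_{\alpha+1}\| \wedge \|Z_\alpha = Z\| \le \|Z \in Y_{\alpha+1}\|. \]

It remains to verify (6). 

First let $U \in D(X) = V_\alpha^B$. Then $\|U \in Z_\alpha\| = \|U \in Z\|$, so that 
$\|U \in Z_\alpha \Leftrightarrow U \in Z\|' = 0$, and a fortiori 

\[ \|U \in X\| \wedge \|U \in Z_\alpha \Leftrightarrow U \in Z\|' = 0. \tag{7} \]

As U varies, the left side of (7) determines a random class of the form 
$X \cap W$, where W corresponds to the formula 
$\neg(u \in Z_\alpha \Leftrightarrow u \in Z)$. Since $D(X) = V_\alpha^B$, it follows 
by 6.3(e) that $X \cap W \sim (X \cap W)_\alpha$. But, according to (7), 
$(X \cap W)_\alpha$ is the zero function on $V_\alpha^B$. Thus, 
$\|U \in X \cap W\| = 0$ for all $U \in V^B$. Consequently, 

\[ \|U \in X\| \le \|U \in Z_\alpha \Leftrightarrow U \in Z\| \quad \text{for all } U. \tag{8} \]

To prove (6), we now write the left and right-hand sides separately (using 
the “truth” of the formula 
$Z_\alpha = Z \Leftrightarrow \forall u(u \in Z_\alpha \Leftrightarrow u \in Z)$): 

\[ \|Z \in Y\| = \bigwedge_{U \in V^B} \|U \in Z\|' \vee \|U \in X\|, \]
\[ \|Z_\alpha = Z\| = \bigwedge_{U \in V^B} \|U \in Z_\alpha \Leftrightarrow U \in Z\|. \]

It is now clear that the inequality in (6) holds term by term. In fact, for 
$\|U \in X\|$ this follows from (8), and for $\|U \in Z\|'$ it follows because 

\[ \|U \in Z_\alpha \Leftrightarrow U \in Z\| = \bigl(\|U \in Z_\alpha\|' \vee \|U \in Z\|\bigr) \wedge \bigl(\|U \in Z_\alpha\| \vee \|U \in Z\|'\bigr), \]

and $\|U \in Z\|' \le \|U \in Z_\alpha\|'$ for all U. □ 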


6.8. Proposition. The regularity axiom 

\[ \forall x(\exists y(y \in x) \Rightarrow \exists y(y \in x \wedge y \cap x = \emptyset)) \]

is “true.” 


PROOF. We fix $X \in V^B$. The axiom with the “constant” X in place of x has 
the form $R \Rightarrow S$. We must show that $\|R \Rightarrow S\| = 1$. It 
suffices to prove that $\|R\| \wedge \|S\|' = 0$, where 

\[ \|R\| = \bigvee_{Y \in V^B} \|Y \in X\|, \tag{9} \]
\[ \|S\|' = \bigwedge_{Y \in V^B} \Bigl(\|Y \in X\|' \vee \Bigl(\bigvee_{Z \in V^B} \|Z \in Y\| \wedge \|Z \in X\|\Bigr)\Bigr). \tag{10} \]

We suppose that $\|R\| \wedge \|S\|' = a \ne 0$, and show that this leads to a 
contradiction. It follows from (9) and (10) that there exists a $Y \in V^B$ such 
that $\|Y \in X\| \wedge a \ne 0$. We choose Y to have the least rank of any 
element with this property. 

It is again clear from (9) and (10) that 

\[ \|Y \in X\| \wedge a \le \bigvee_{Z \in V^B} \|Z \in Y\| \wedge \|Z \in X\|. \]

On the right we may sum only over $Z \in D(Y)$, without changing the value 
of the sum. Hence, there must exist a $Z \in D(Y)$ such that 

\[ \|Z \in X\| \wedge \|Y \in X\| \wedge a \ne 0, \]

so that $\|Z \in X\| \wedge a \ne 0$. But the rank of Z is less than the rank of Y, 
contradicting the choice of Y. □ 


7 The axioms of infinity, replacement, 
and choice are “true” 


7.1. We begin this section by describing two more methods for 
constructing random sets. The first of them, which is very widely used, solves the 
following problem. Suppose we are given a set of objects $X_i \in V^B$, 
$i \in I$, and a set of elements $a_i \in B$. We would like to construct a random 
set X which contains each $X_i$ with probability $a_i$, but such an X might not 
exist. However, it turns out that there always exists an X with 
$\|X_i \in X\| \ge a_i$ for all $i \in I$; moreover, there exists a least X with 
this property. 


7.2. Lemma. 
(a) Under the conditions in 7.1, the function X of Y 

\[ \|Y \in X\| = \bigvee_i a_i \wedge \|Y = X_i\| \tag{1} \]

is a random class X which is equivalent to a random set. In addition, 
$\|X_i \in X\| \ge a_i$, and, if $X'$ is any random class such that 
$\|X_i \in X'\| \ge a_i$ for each i, then $\|Y \in X'\| \ge \|Y \in X\|$ for all Y. 

We shall say that X (or the equivalent random set) collects the $X_i$ with 
probabilities $a_i$. 

(b) Under the same conditions, the function Z of Y 

\[ \|Y \in Z\| = \bigvee_i a_i \wedge \|Y \in X_i\| \tag{2} \]

is a random class Z which is equivalent to a random set. If we also have 
$a_i \wedge a_j = 0$ for all $i \ne j$, then $\|Z = X_i\| \ge a_i$, and, for any 
random class $Z'$ such that $\|Z' = X_i\| \ge a_i$ for each i, we have 
$\|Y \in Z'\| \ge \|Y \in Z\|$ for all Y. 

We shall say that Z glues together the $X_i$ with probabilities $a_i$. 


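PROOF. It is easily verified that the functions Z and X defined by formulas 
(1) and (2) are extensional. 

There exists an ordinal $\alpha$ such that $X_i \in V_\alpha^B$ for all i. We show 
that $X \sim X_\alpha$ and $Z \sim Z_\alpha$. For any $Y \in V^B$ we have: 

\[ \|Y \in X_\alpha\| = \bigvee_{U \in V_\alpha^B} \|Y = U\| \wedge \|U \in X_\alpha\| \]
\[ = \bigvee_{U \in V_\alpha^B} \bigvee_i \|Y = U\| \wedge a_i \wedge \|U = X_i\| \]
\[ = \bigvee_{U \in V_\alpha^B} \bigvee_i a_i \wedge \|Y = X_i\| \wedge \|U = X_i\|. \]

If we consider the term with $U = X_i$ on the right, we obtain 
$a_i \wedge \|Y = X_i\| \le \|Y \in X_\alpha\|$, so that 
$\|Y \in X\| \le \|Y \in X_\alpha\|$ by (1), and the assertion follows by 6.3(c). 

Similarly, for any $Y \in V^B$ we have 

\[ \|Y \in Z_\alpha\| = \bigvee_{U \in V_\alpha^B} \bigvee_i \|Y = U\| \wedge a_i \wedge \|U \in X_i\| = \bigvee_{U \in V_\alpha^B} \bigvee_i a_i \wedge \|Y \in X_i\| \wedge \|Y = U\|. \]

Since $\|Y \in X_i\| = \bigvee_{U \in V_\alpha^B} \|Y = U\| \wedge \|Y \in X_i\|$, it 
follows that $a_i \wedge \|Y \in X_i\| \le \|Y \in Z_\alpha\|$, and 
$\|Y \in Z\| \le \|Y \in Z_\alpha\|$ by (2). 

Now let $X'$ and $Z'$ be any random sets with the properties in (a) and 
(b). It is clear from (1) that $\|X_i \in X\| \ge a_i$. If 
$\|X_i \in X'\| \ge a_i$ for each i, then 

\[ \|Y \in X'\| = \bigvee_U \|Y = U\| \wedge \|U \in X'\| \ge \bigvee_i \|Y = X_i\| \wedge \|X_i \in X'\| \ge \|Y \in X\| \]

by (1). 

Similarly, if $a_i \wedge a_j = 0$ for $i \ne j$, then it is clear from (2) that 
$a_i \wedge \|Y \in Z\| = a_i \wedge \|Y \in X_i\|$, so that 

\[ a_i \wedge \|X_i = Z\| = \bigwedge_Y a_i \wedge \|Y \in X_i \Leftrightarrow Y \in Z\| = a_i, \]

and $\|X_i = Z\| \ge a_i$. Now if $\|X_i = Z'\| \ge a_i$ for each i, then 

\[ \|Y \in Z'\| \ge \|Y \in Z'\| \wedge \|Z' = X_i\| = \|Y \in X_i\| \wedge \|Z' = X_i\| \ge a_i \wedge \|Y \in X_i\|, \]

so that $\|Y \in Z'\| \ge \|Y \in Z\|$. □ 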

Here is our first application of Lemma 7.2(a): 

7.3. Proposition. The axiom of infinity 

\[ \exists x(\emptyset \in x \wedge \forall u(u \in x \Rightarrow \{u\} \in x)) \]

is “true.” 


PROOF. When we proved that the axiom of pairing is “true,” we 
constructed for any $U, W \in V^B$ an element $Z \in V^B$ (unique up to 
equivalence) with the property that $\|Y \in Z\| = \|Y = U \vee Y = W\|$ for all 
Y. It is natural to let $\{U, W\}^B$ denote this element Z, and let 
$\{U\}^B = \{U, U\}^B$. 

We now verify the axiom of infinity. We set $X_0 = \emptyset$, 
$X_1 = \{\emptyset\}^B$, ..., $X_n = \{X_{n-1}\}^B$, ... . Further, we let 
$X \in V^B$ be the element which collects all the $X_n$ with probabilities 1. We 
show that 

\[ \|\emptyset \in X \wedge \forall u(u \in X \Rightarrow \{u\} \in X)\| = 1. \]

It is obviously sufficient to prove that for all $U \in V^B$ we have 
$\|U \in X\| \le \|\{U\}^B \in X\|$, that is, by (1): 

\[ \bigvee_{i \ge 0} \|U = X_i\| \le \bigvee_{i \ge 0} \|\{U\}^B = X_i\|. \]

In fact, since the formula $u = x \Leftrightarrow \{u\} = \{x\}$ is “true,” and 
since $X_{i+1} = \{X_i\}^B$, it immediately follows that 

\[ \|U = X_i\| = \|\{U\}^B = X_{i+1}\|. \] □ 


7.4. Lemma. Let W be a random class. Then there exists an element 
$X \in V^B$ such that 

\[ \bigvee_{U \in V^B} W(U) = W(X). \]

The left-hand side may be represented in the form 
$\|\exists x(x \in W)\| = \|W \ne \emptyset\|$. Hence, intuitively, the lemma says 
that the probability that a given class is nonempty coincides with the 
probability that a suitable element occurs in it. 


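PROOF. We first show that there exists an ordinal $\beta$ such that 
$\bigvee_{U \in V_\beta^B} W(U) = \bigvee_{U \in V^B} W(U)$. In fact, let 
$a_\gamma = \bigvee_{U \in V_\gamma^B} W(U)$, and for any $a \in B$ set 
$\gamma(a) = \min\{\gamma \mid a_\gamma \ge a\}$ (or $\gamma(a) = 0$ if 
$a_\gamma \not\ge a$ for all $\gamma$). Finally, set 
$\beta = \sup_{a \in B} \gamma(a)$. This is an ordinal, because B is a set. If 
$\gamma > \beta$, then $a_\gamma \ge a_\beta$ by monotonicity, but we cannot have 
$a_\gamma > a_\beta$ because of the choice of $\beta$. 

Thus, let $\bigvee_U W(U) = \bigvee_{U \in V_\beta^B} W(U)$. We index all the 
elements in $V_\beta^B$ by an initial segment of ordinals (by the axiom of 
choice!): $V_\beta^B = \{U_\alpha\}_{\alpha \in I}$. We set 

\[ a_\alpha = W(U_\alpha) \wedge \Bigl(\bigvee_{\gamma < \alpha} W(U_\gamma)\Bigr)', \quad \alpha \in I. \]

Obviously $a_\alpha \wedge a_\gamma = 0$ for $\alpha \ne \gamma$. Using Lemma 
7.2(b), we glue together the sets $U_\alpha$ with probabilities $a_\alpha$ 
($\alpha \in I$). We obtain a set X satisfying the conditions 
$\|X = U_\alpha\| \ge a_\alpha$, where $a_\alpha \le W(U_\alpha)$ and 
$\bigvee_\alpha a_\alpha = \bigvee_\alpha W(U_\alpha)$. Using the extensionality 
of W, we find: 

\[ W(X) \ge \bigvee_{\alpha \in I} \|X = U_\alpha\| \wedge W(U_\alpha) = \bigvee_{\alpha \in I} W(U_\alpha) = \bigvee_{U \in V_\beta^B} W(U). \] □ 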
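
The step in the proof that replaces the values $W(U_\alpha)$ by pairwise disjoint elements $a_\alpha$ with the same Boolean sum can be illustrated on a finite algebra. This is a small sketch under assumptions, not part of the original argument: B is an algebra of bitmasks with unit `top`, the index set is a finite list, and the function name `disjointify` is hypothetical.

```python
def disjointify(values, top):
    """Return a_i = w_i ∧ (w_0 ∨ ... ∨ w_{i-1})' for bitmask-valued w_i."""
    seen = 0  # Boolean sum of the values met so far
    out = []
    for w in values:
        out.append(w & (top ^ seen))  # keep only the part of w not covered yet
        seen |= w
    return out
```

For instance, with `top = 0b1111` the list `[0b0110, 0b0011, 0b1010, 0b0000]` becomes `[0b0110, 0b0001, 0b1000, 0b0000]`: the entries are pairwise disjoint, each lies below the corresponding input, and their Boolean sum is unchanged.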
7.5. Proposition. The replacement axiom 

\[ \forall z\, \forall u\bigl(\forall x(x \in u \Rightarrow \exists! y\, P(x, y, z)) \Rightarrow \exists w\, \forall y(y \in w \Leftrightarrow \exists x(x \in u \wedge P(x, y, z)))\bigr) \]

is “true” (here $z = \langle z_1, \ldots, z_n \rangle$). 


PROOF. We fix a “vector” $Z = \langle Z_1, \ldots, Z_n \rangle$ with 
$Z_i \in V^B$ and an element $U \in V^B$. We shall write P(x, y) instead of 
P(x, y, Z). If we write the axiom with the “constants” $Z_i$ and U in the form 
$R \Rightarrow S$, then we must prove that $\|R \Rightarrow S\| = 1$. 


7.6. The special case: If $\|R\| = 1$, then $\|S\| = 1$. 

We first show how the general case follows from this special case. Let 
$a \in B$, and let $B_a$ denote the set $\{b \in B \mid b \le a\}$. The operations 
on B induce a Boolean algebra structure on $B_a$ with unit element $1_a = a$. 
The natural mapping $B \to B_a$: $b \mapsto b \wedge a$ is a homomorphism. An 
easy induction on $\alpha$ allows us to construct a surjective map of universes 
$V^B \to V^{B_a}$: $X \mapsto X_a$ such that for all $X, Y \in V^B$ we have 

\[ \|X_a \in Y_a\| = \|X \in Y\| \wedge a, \]
\[ \|X_a = Y_a\| = \|X = Y\| \wedge a. \]
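
The claim that $b \mapsto b \wedge a$ is a homomorphism onto $B_a$ is easy to confirm by brute force on a finite algebra. The sketch below is an illustration under assumptions: B is the sixteen-element algebra of bitmasks 0 to 15, the complement inside $B_a$ is taken relative to the unit a, and the names `restrict`, `comp_in`, and `is_homomorphism` are hypothetical.

```python
TOP = 0b1111  # B = all subsets of a four-point space, as bitmasks (assumption)

def restrict(b, a):
    """The natural map B -> B_a, sending b to b ∧ a."""
    return b & a

def comp_in(b, a):
    """Complement of b inside B_a, whose unit element is a."""
    return a & ~b

def is_homomorphism(a):
    """Check that restriction to a preserves ∧, ∨, complement, 0 and 1."""
    elems = range(TOP + 1)
    for b in elems:
        if restrict(TOP ^ b, a) != comp_in(restrict(b, a), a):
            return False
        for c in elems:
            if restrict(b & c, a) != restrict(b, a) & restrict(c, a):
                return False
            if restrict(b | c, a) != restrict(b, a) | restrict(c, a):
                return False
    return restrict(TOP, a) == a and restrict(0, a) == 0
```

The function can be called for each a in `range(TOP + 1)`; the check of complements is the only place where the unit of $B_a$ matters.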

Now, to prove Proposition 7.5 from the special case 7.6, we choose 
$a = \|R\|$. Then $\|R\|_a = 1_a$, so that 7.6 implies that $\|S\|_a = 1_a$. 
This means that $\|S\| \ge a$, and hence $\|R \Rightarrow S\| = 1$. (Here we have 
used 7.6 in $V^{B_a}$; clearly $\|R\|_a = \|R_a\|$, where $R_a$ is the obvious 
image of R in $V^{B_a}$.) 


7.7. PROOF OF 7.6. The condition $\|R\| = 1$ means that for any $X \in V^B$ 

\[ \|X \in U\| \le \|\exists! y\, P(X, y)\|. \tag{3} \]

To show that $\|S\| = 1$, it is sufficient if, given $U \in V^B$, we find a 
$W \in V^B$ such that for all $Y \in V^B$ 

\[ \|Y \in W\| = \bigvee_{X \in V^B} \|X \in U\| \wedge \|P(X, Y)\|. \tag{4} \]

It follows from 6.4 that the formula (4) defines W as a random class. We 
find an ordinal $\alpha$ such that $W \sim W_\alpha$. 

To do this, we first note that in (4) we may take the sum only over 
$X \in D(U)$: 

\[ \|Y \in W\| = \bigvee_{X \in D(U)} \|X \in U\| \wedge \|P(X, Y)\| \tag{5} \]

(the argument here is the same as after formula (3) in §6). We now apply 
Lemma 7.4 to the class $W_X(Y) = \|P(X, Y)\|$. It follows that for every 
$X \in D(U)$ there exists an element $Y_X \in V^B$ such that 

\[ \|\exists y\, P(X, y)\| = \|P(X, Y_X)\|. \tag{6} \]

(Because $\|\exists! y\, P(X, y)\| \le \|\exists y\, P(X, y)\|$, we can use these 
$Y_X$ to estimate $\|X \in U\|$ with the help of (9) below.) 

We set $\alpha_X = \min\{\alpha \mid Y_X \in V_\alpha^B\}$, and 

\[ \alpha = \sup\{\alpha_X \mid X \in D(U)\}, \]

and then show that $W \sim W_\alpha$ for this $\alpha$. We must verify that 
$\|Y \in W\| \le \|Y \in W_\alpha\|$ for every Y. By (5) and by formula (II) in 
§5, this follows if for any $X \in D(U)$ we have 

\[ \|X \in U\| \wedge \|P(X, Y)\| \le \|Y = Y_X\| \wedge \|Y_X \in W_\alpha\|. \tag{7} \]

In the first place, by (3), (6), (5), and the definition of $\alpha$, we have: 

\[ \|X \in U\| \le \|P(X, Y_X)\|, \qquad \|X \in U\| \le \|Y_X \in W\| = \|Y_X \in W_\alpha\|. \tag{8} \]

Further, we consider the following formula, which is “true” because it is 
deducible from the logical axioms and the axioms of equality: 

\[ \forall x(\exists! y\, P(x, y) \wedge P(x, y_1) \wedge P(x, y_2) \Rightarrow y_1 = y_2). \]

We thereby obtain: 

\[ \|\exists! y\, P(X, y)\| \wedge \|P(X, Y)\| \wedge \|P(X, Y_X)\| \le \|Y = Y_X\|. \tag{9} \]

Finally, it follows from (3), (8), and (9) that 

\[ \|X \in U\| \wedge \|P(X, Y)\| \le \|Y = Y_X\| \wedge \|Y_X \in W_\alpha\|, \]

i.e., we have (7). □ 


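7.8. Proposition. The axiom of choice is “true.” 


PROOF. Recall that the axiom of choice has the form 
$\forall x\, \exists y(Q \wedge R \wedge S \wedge T)$, where 


Q denotes: $\forall z(z \in y \Rightarrow \exists u\, \exists w(z = \langle u, w \rangle))$ 
(“y is a binary relation”); 

R denotes: $\forall u\, \forall w_1\, \forall w_2(\langle u, w_1 \rangle \in y \wedge \langle u, w_2 \rangle \in y \Rightarrow w_1 = w_2)$ 
(“y is a function”); 

S denotes: $\forall u(\exists w(\langle u, w \rangle \in y) \Rightarrow u \in x)$ 
(“the domain of definition of y is contained in x”); 

T denotes: $\forall u(u \ne \emptyset \wedge u \in x \Rightarrow \exists w(w \in u \wedge \langle u, w \rangle \in y))$ 
(“the domain of definition of y coincides with x, and y chooses one element 
from each nonempty element of x”). 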


We fix $X \in V^B$ and construct the corresponding “choosing function” Y. 
To do this: 


(a) We index D(X) by an initial segment of ordinals: 
$D(X) = \{U_0, U_1, \ldots, U_\alpha, \ldots\}$, $\alpha \in I$. 


(b) For each $U_\alpha \in D(X)$ we use Lemma 7.4 to find an element 
$W_\alpha \in V^B$ such that 

\[ \|W_\alpha \in U_\alpha\| = \bigvee_{W \in V^B} \|W \in U_\alpha\|. \]

(c) For each $\alpha \in I$ we set 

\[ a_\alpha = \|U_\alpha \in X\| \wedge \Bigl(\bigvee_{\beta < \alpha} \|U_\beta \in X\| \wedge \|U_\beta = U_\alpha\|\Bigr)'. \]

(d) Finally, we let Y denote the set which collects the “ordered pairs” 
$\langle U_\alpha, W_\alpha \rangle^B$ with probabilities $a_\alpha$, 
$\alpha \in I$. Here, of course, 

\[ \langle U, W \rangle^B = \{\{U\}^B, \{U, W\}^B\}^B. \]


The idea of this construction is as follows. In each $U_\alpha$ we choose the 
element $W_\alpha$ which belongs to $U_\alpha$ “with the largest possible 
probability.” We then put together the graph of the choice function Y from the 
“pairs” $\langle U_\alpha, W_\alpha \rangle^B$, where we take the pairs in the 
order they are indexed, but only include a given 
$\langle U_\alpha, W_\alpha \rangle^B$ to the extent that $U_\alpha$ “was not 
already considered earlier as belonging to X.” 

We now substitute X and Y in place of x and y in the axiom of choice, 
and, letting Q, R, S, and T now denote the corresponding formulas with 
these constants, we show that $\|Q\| = \|R\| = \|S\| = \|T\| = 1$. We shall 
constantly be using the following formula, which follows from (1) and the 
definition of Y: 

\[ \|Z \in Y\| = \bigvee_\alpha \|Z = \langle U_\alpha, W_\alpha \rangle^B\| \wedge a_\alpha. \tag{10} \]

7.9. $\|Q\| = 1$. By the definition of Q, this means that for all $Z \in V^B$ we 
must have 

\[ \|Z \in Y\| \le \bigvee_{U, W} \|Z = \langle U, W \rangle^B\|, \]

but this is obvious from (10). 


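7.10. $\|R\| = 1$. By the definition of R, for any $U, W^1, W^2 \in V^B$ we must 
prove the inequality 

\[ \|\langle U, W^1 \rangle^B \in Y\| \wedge \|\langle U, W^2 \rangle^B \in Y\| \le \|W^1 = W^2\|. \]

Using (10), we rewrite the left-hand side in the form 

\[ \bigvee_{\alpha, \beta} \|U = U_\alpha\| \wedge \|W^1 = W_\alpha\| \wedge a_\alpha \wedge \|U = U_\beta\| \wedge \|W^2 = W_\beta\| \wedge a_\beta. \]

Since $\|U = U_\alpha\| \wedge \|U = U_\beta\| \le \|U_\alpha = U_\beta\|$ and 
$\|U_\alpha = U_\beta\| \wedge a_\alpha \wedge a_\beta = 0$ for $\alpha \ne \beta$ 
(see the definition of $a_\alpha$), it follows that in this sum we need only 
consider the terms with $\alpha = \beta$. But such a term is 
$\le \|W^1 = W_\alpha\| \wedge \|W^2 = W_\alpha\| \le \|W^1 = W^2\|$, as required. 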


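7.11. $\|S\| = 1$. This is equivalent to the inequality 

\[ \|\langle U, W \rangle^B \in Y\| \le \|U \in X\|. \]

But, by (10), the left-hand side equals 

\[ \bigvee_\alpha \|U = U_\alpha\| \wedge \|W = W_\alpha\| \wedge a_\alpha \le \bigvee_\alpha \|U = U_\alpha\| \wedge \|W = W_\alpha\| \wedge \|U_\alpha \in X\| \le \bigvee_\alpha \|U = U_\alpha\| \wedge \|U_\alpha \in X\| = \|U \in X\|. \]
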
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7.12. || T|| = 1. We must prove that for any U € V? 
JU EXIAIUAS| < Ie UIA\KU, We Y\. (1) 


We first show that it suffices to prove (11) for U € D(X), ie., for all U,, 
a € I. In fact, suppose (11) holds for all U,. Then for U € V? we have 


JU EX = VU = UN AIU. E Xl 
|V4SI= Vo We vi, 
u,Eev? 


and hence 
IV EX|IAIUFO| = ae |U, E UIA U = Ui AI Ua € * Ih 
< |U € Us AYU = UA |U, EX] (by (IID) in §5) 
= VU. # SILA = Ugll A{| Ua = * || 
= MIMS Ux AIK Ua W>? E YIIAU= Ul (by (11) for U,) 
<Viwe UI|AIKU, W>? € YI. 
(Here we used the fact that 
XU, W>? € YAU = UI 
7 VlU. = Us| A||W= Wall Nag Al U = Ul 
s ViilU= Upgll AW = Wall Aap 
=|(U, W>? € YI.) 
Thus, it remains to prove (11) for U,, a € J. Now 
|U, # Dll = ||Aw(w € UV, )|| = V/ |W e U,|| = |W. © Vall 
Hence (11) can be rewritten 
|U, € XI AI] W, © Uy] < Viwe Uz || A<Ua W>? € Y||- (12) 


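We prove this by induction on $\alpha$. (12) is obvious for $\alpha = 0$, since the term
on the right with $W = W_0$ coincides with the left-hand side. Suppose (12)
holds for all $\beta < \alpha$.

By the definition of $a_\alpha$, we have

\[ \|U_\alpha \in X\| = a_\alpha \vee \bigvee_{\beta<\alpha} \bigl( \|U_\beta \in X\| \wedge \|U_\beta = U_\alpha\| \bigr). \]

If we substitute this formula in the left-hand side of (12), we find that we
must prove two inequalities:

\[ a_\alpha \wedge \|W_\alpha \in U_\alpha\| \le \bigvee_W \|W \in U_\alpha\| \wedge \|\langle U_\alpha, W \rangle^B \in Y\|, \tag{13} \]

\[ \|U_\beta \in X\| \wedge \|U_\beta = U_\alpha\| \wedge \|W_\alpha \in U_\alpha\| \le \bigvee_W \|\langle U_\alpha, W \rangle^B \in Y\| \wedge \|W \in U_\alpha\|, \quad \text{for all } \beta < \alpha. \tag{14} \]

The inequality (13) is obvious if we look at the term on the right with
$W = W_\alpha$. The inequality (14) reduces to the induction assumption as
follows. The left-hand side of (14) is

\[
\begin{aligned}
&\le \|U_\beta \in X\| \wedge \|U_\beta = U_\alpha\| \wedge \|W_\alpha \in U_\beta\| \\
&\le \|U_\beta \in X\| \wedge \|U_\beta = U_\alpha\| \wedge \|W_\beta \in U_\beta\|
\end{aligned}
\]

by the definition of $W_\beta$. Further, using the induction assumption and
extensionality, we have

\[
\begin{aligned}
\|U_\beta = U_\alpha\| \wedge \|U_\beta \in X\| \wedge \|W_\beta \in U_\beta\|
&\le \bigvee_W \|W \in U_\beta\| \wedge \|\langle U_\beta, W \rangle^B \in Y\| \wedge \|U_\beta = U_\alpha\| \\
&\le \bigvee_W \|W \in U_\alpha\| \wedge \|\langle U_\alpha, W \rangle^B \in Y\|,
\end{aligned}
\]

which completes the verification of the axiom of choice. □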


8 The Continuum Hypothesis is “false” for suitable B 


8.1. We recall (Lemma 7.2(a)) that the set $X \in V^B$ collects the sets $\{X_i\}$
with probabilities $a_i \in B$ ($i \in I$) if $\|Y \in X\| = \bigvee_i \|Y = X_i\| \wedge a_i$ for all $Y$.
Using this definition, we can introduce a useful canonical mapping $t \mapsto \hat t$
from the von Neumann universe $V$ to the universe $V^B$. Let $\hat\emptyset = \emptyset$ (recall
that $\|Y \in \emptyset\| = 0$ for all $Y$), and, if $\hat s$ has already been defined for all
$s \in V_\alpha$, then for $t \in V_{\alpha+1}$ we let $\hat t$ collect all the $\hat s$ for $s \in t$ with probabilities
1. In other words, for any $Y \in V^B$

\[ \|Y \in \hat t\| = \bigvee_{s \in t} \|Y = \hat s\|. \tag{1} \]

(Here the collecting set $\hat t$ is not uniquely defined, i.e., it is only defined
modulo equivalence, so that, strictly speaking, we should also specify the
rank of $\hat t$, for example by saying that it equals the rank of $t$. This is not
essential for us, however, since we shall only be interested in the truth
functions, which do not change if we replace an object by an equivalent
object.)

We now formulate some additional conditions (besides completeness)
which must be imposed on the Boolean algebra $B$ for the purposes of this
section. Recall that $\omega_0$ is the first infinite ordinal, $\omega_1$ is the first ordinal
having cardinality $> \omega_0$, and $\omega_2$ is the first ordinal having cardinality $> \omega_1$.

8.2. Conditions on B.

(a) The countable chain condition, which, we recall, says that if we have
a family of elements $\{a_i\}$, $i \in I$, such that $a_i \ne 0$ and $a_i \wedge a_j = 0$ for $i \ne j$,
then $I$ is at most a countable set.

(b) There exists a family of elements $b(n, \alpha) \in B$, indexed by the set
$\omega_0 \times \omega_2$, with the following property: if $Z(\alpha)$ collects the elements $\hat n$,
$n \in \omega_0$, with probabilities $b(n, \alpha)$, then $\|Z(\alpha) = Z(\beta)\| = 0$ for $\alpha \ne \beta$,
$\alpha, \beta \in \omega_2$.

The second condition has the following intuitive meaning. It is easy to
see that $\|Z(\alpha) \subset \hat\omega_0\| = 1$. In fact, this equality is equivalent to
$\|\forall x (x \in Z(\alpha) \Rightarrow x \in \hat\omega_0)\| = 1$, i.e., to

\[ \forall X \in V^B, \quad \|X \in Z(\alpha)\| \le \|X \in \hat\omega_0\|, \]

and this is obvious from (1), since $\hat\omega_0$ collects the $\hat n$ with probabilities 1, and
$Z(\alpha)$ collects the $\hat n$ with probabilities $b(n, \alpha) \le 1$.

Thus, condition (b) means that we can find $\omega_2$ distinct subsets
$Z(\alpha) \subset \hat\omega_0$, so that, in the naive sense, we have $\operatorname{card} \mathcal{P}(\hat\omega_0) \ge \omega_2$. This is precisely
the negation of the Continuum Hypothesis. Of course, it is still necessary
to show that this intuitive idea can be made into a proof.


8.3. The existence of B with the required properties. We could use measurable
sets, as in §3. However, in order to vary our approach, and to prepare
for §9, we give another construction. Let $\{0, 1\}$ be the discrete two-point
space, let $I = \omega_0 \times \omega_2$ and let $S = \{0, 1\}^I$ be the space of vectors whose
coordinates are indexed by $I$ and take the values 0 or 1. We introduce the
direct product topology on $S$. It has a standard basis of open sets, each
consisting of all vectors whose coordinates indexed by some finite subset
of $I$ are fixed.

If $a \subset S$, we set

\[ a' = \text{the complement of the closure of } a \text{ in } S, \]

and we set $a'' = (a')'$. Sets $a \subset S$ with $a'' = a$ are called regular open sets in
$S$.


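8.4. Theorem. Let

\[
\begin{aligned}
B &= \{a \subset S \mid a'' = a\}, \\
a \wedge b &= a \cap b, \\
a \vee b &= (a \cup b)''.
\end{aligned}
\]

Then $B$ with the operations $\wedge$, $\vee$, and $'$ is a complete Boolean algebra
with the countable chain condition, and $\bigvee_i a_i = (\bigcup_i a_i)''$ for any family
of $a_i \in B$.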


We omit the proof (see J. B. Rosser, Simplified Independence Proofs,
Academic Press, New York, 1969, Chapter 2).

8.5. Lemma. Under the conditions in 8.4, let

\[ b(n, \alpha) = \text{the set of vectors with 1 in the } (n, \alpha) \text{ place}, \]

and let $Z(\alpha)$ be defined as in 8.2(b). Then

\[ \|Z(\alpha) = Z(\beta)\| = 0, \quad \text{for } \alpha \ne \beta. \]

Proof. By formula (5) in §4, we have

\[ \|Z(\alpha) = Z(\beta)\| = \bigwedge_n \bigl( b(n, \alpha) \wedge b(n, \beta) \bigr) \vee \bigl( b(n, \alpha)' \wedge b(n, \beta)' \bigr). \]

The right side can only become larger if we replace $\bigwedge$ by $\bigcap$ and $\vee$ by $\cup$;
here the primes $'$ coincide with the ordinary complements. If we had
$\|Z(\alpha) = Z(\beta)\| \ne 0$, then there would exist an element $X$ in the standard
basis of the topology (see the beginning of 8.3) which is contained in

\[ \bigcap_n \bigl( b(n, \alpha) \cap b(n, \beta) \bigr) \cup \bigl( b(n, \alpha)' \cap b(n, \beta)' \bigr). \]

But this intersection consists of all vectors having the same $(n, \alpha)$-coordinate
and $(n, \beta)$-coordinate for all $n$, while all coordinates except for a
finite number range freely in any element $X$ of the standard basis of the
topology. □
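
A finite truncation makes the last step of this proof tangible. The sketch below is an illustration only: it keeps just `n_rows` values of $n$ and two columns (named `"alpha"` and `"beta"` here, both hypothetical labels), so the space is finite and discrete, and it counts the vectors that lie in the truncated intersection. Each extra value of $n$ halves the fraction, which is consistent with the fact that in the infinite product no basic open set fits inside the full intersection.

```python
from itertools import product

def agreement_fraction(n_rows):
    """Fraction of vectors in {0,1}^(n_rows x 2) whose (n, alpha)- and
    (n, beta)-coordinates agree for every n < n_rows."""
    coords = [(n, col) for n in range(n_rows) for col in ("alpha", "beta")]
    total = agree = 0
    for bits in product((0, 1), repeat=len(coords)):
        v = dict(zip(coords, bits))
        total += 1
        if all(v[(n, "alpha")] == v[(n, "beta")] for n in range(n_rows)):
            agree += 1
    return agree / total

fractions = [agreement_fraction(n) for n in (1, 2, 3)]
```

In the finite case the truncated set is itself open and nonempty, so the truncation does not prove the lemma; it only shows how the agreement condition becomes more restrictive as more coordinates are constrained.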
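8.6. Formulation of the negation of the Continuum Hypothesis. We shall
prove that the following is “true”:

\[
\neg\mathrm{CH}:\quad
\begin{aligned}
&\forall x \bigl( (\text{“$x$ is an ordinal”} \wedge \text{“$x$ is not finite”} \wedge \forall y (y \in x \Rightarrow \text{“$y$ is finite”})) \\
&\quad \Rightarrow \exists w (\text{“there is no function from $x$ onto all of $w$”} \wedge \text{“there is no function from $w$ onto $\mathcal{P}(x)$”}) \bigr).
\end{aligned}
\]

Here:

\[ x \text{ is finite}: \quad \forall y (y \subset x \wedge y \ne x \Rightarrow \text{“there is no function from $y$ onto all of $x$”}). \]

We leave the translation of the other abbreviated notation to the reader.

The premise in $\neg$CH says that “$x$ is the first infinite ordinal,” and the
conclusion says that “$w$ is a set having cardinality intermediate between
that of $x$ and that of $\mathcal{P}(x)$.” We shall abbreviate $\neg$CH as follows:

\[ \forall x \bigl( P(x) \Rightarrow \exists w (Q_1(x, w) \wedge Q_2(x, w)) \bigr). \tag{2} \]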


8.7. Reduction Lemma. Let $P(x)$ and $Q(x)$ be two formulas in the
Zermelo-Fraenkel language having one free variable $x$ and satisfying the
properties:

The formula $\exists! x\, P(x)$ is deducible from the axioms, and
$X_0 \in V^B$ is an element such that $\|P(X_0)\| = 1$.

Then $\|P(X)\| = \|X = X_0\|$ for all $X$, and if $\|Q(X_0)\| = 1$, it follows that
$\|\forall x (P(x) \Rightarrow Q(x))\| = 1$.




Proof. We first note that $\|\exists x\, P(x)\| \ge \|\exists! x\, P(x)\| = 1$, since all the
axioms are “true” in $V^B$, and the rules of deduction preserve “truth.” It
hence follows from Lemma 7.4 that there exists an object $X_0 \in V^B$ with
$\|P(X_0)\| = 1$.

Further, $P(x) \wedge P(y) \Rightarrow x = y$ is also deducible, so that, if we apply this
with $X$ in place of $x$ and $X_0$ in place of $y$, we find that

\[ \|P(X)\| \le \|X = X_0\|. \tag{3} \]

But $\|P(X)\| \wedge \|X = X_0\| = \|P(X_0)\| \wedge \|X = X_0\| = \|X = X_0\|$. Hence the
inequality in (3) may be replaced by equality.

Finally, we suppose that $\|Q(X_0)\| = 1$. Then, by what was just proved,

\[ \|P(X)\| = \|Q(X_0)\| \wedge \|X = X_0\| = \|Q(X)\| \wedge \|X = X_0\| = \|Q(X)\| \wedge \|P(X)\|, \]

so that $\|P(X)\| \le \|Q(X)\|$, and $\|\forall x (P(x) \Rightarrow Q(x))\| = 1$. □


This lemma can be applied to $\neg$CH in the form (2), since the formula
$\exists! x\, P(x)$, where $P(x)$ is the premise “$x$ is the first infinite ordinal,” is
deducible from the axioms. We shall not give this formal deduction, and
shall consider the uniqueness of $\omega_0$ to be common knowledge. Now, by
Lemma 8.7, to verify $\neg$CH it suffices to show the following facts:


8.8. $\|P(\hat\omega_0)\| = 1$. (In other words, $\hat\omega_0$ plays the role of $X_0$ in our situation.)

8.9. $\|Q_1(\hat\omega_0, \hat\omega_1)\| = 1$.

8.10. $\|Q_2(\hat\omega_0, \hat\omega_1)\| = 1$. (This then implies that
$\|\exists w (Q_1(\hat\omega_0, w) \wedge Q_2(\hat\omega_0, w))\| = 1$, and completes the verification of the
conditions of the lemma.)

8.8 is verified almost mechanically, and we leave it as an exercise.


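8.11. VERIFICATION OF 8.9. We must show that, if $B$ satisfies the countable
chain condition, then

\[ \|\exists \text{ a function from } \hat\omega_0 \text{ onto all of } \hat\omega_1\| = 0. \]

The proof that follows carries over word for word to the more general case,
when instead of $\omega_0$ and $\omega_1$ we take any pair $s, t \in V$ such that
card $s$ < card $t$ and card $s$ is infinite.

We suppose that

\[ 0 \ne a = \|\exists f (f \text{ is a function} \wedge \forall y (y \in \hat\omega_1 \Rightarrow \exists x (x \in \hat\omega_0 \wedge \langle x, y \rangle \in f)))\|, \]

and we show that this leads to a contradiction. There must exist an
$F \in V^B$ such that

\[ a \le \|F \text{ is a function}\| \wedge \Bigl( \bigwedge_Y \cdots \Bigr). \]

For every $\alpha \in \omega_1$ we consider the term in $\bigwedge_Y \cdots$ corresponding to
$Y = \hat\alpha$ and use the fact that $\|\hat\alpha \in \hat\omega_1\| = 1$. We obtain:

\[ a \le \|F \text{ is a function}\| \wedge \Bigl( \bigvee_X \|X \in \hat\omega_0\| \wedge \|\langle X, \hat\alpha \rangle^B \in F\| \Bigr). \tag{4} \]

By (1), we have

\[
\begin{aligned}
\|X \in \hat\omega_0\| \wedge \|\langle X, \hat\alpha \rangle^B \in F\|
&= \bigvee_{n<\omega_0} \|X = \hat n\| \wedge \|\langle X, \hat\alpha \rangle^B \in F\| \\
&= \bigvee_{n<\omega_0} \|X = \hat n\| \wedge \|\langle \hat n, \hat\alpha \rangle^B \in F\|,
\end{aligned}
\]

so that, if we sum first over $X$ and then over $n$, we may write (4) in the
form

\[ a \le \|F \text{ is a function}\| \wedge \bigvee_{n<\omega_0} \|\langle \hat n, \hat\alpha \rangle^B \in F\|. \]

Hence, for every $\alpha < \omega_1$ there is an $n(\alpha) < \omega_0$ such that

\[ \|F \text{ is a function}\| \wedge \|\langle \widehat{n(\alpha)}, \hat\alpha \rangle^B \in F\| \ne 0. \]

Then there exists an $n_0$ and a subset $I \subset \omega_1$ of cardinality $\omega_1$ such that

\[ 0 \ne a_\alpha = \|F \text{ is a function}\| \wedge \|\langle \hat n_0, \hat\alpha \rangle^B \in F\|, \quad \text{for all } \alpha \in I. \]

It remains to show that $a_\alpha \wedge a_\beta = 0$ for $\alpha \ne \beta$, which contradicts the
countable chain condition on $B$. Now by the definition of a function

\[ a_\alpha \wedge a_\beta = \|F \text{ is a function}\| \wedge \|\langle \hat n_0, \hat\alpha \rangle^B \in F\| \wedge \|\langle \hat n_0, \hat\beta \rangle^B \in F\| \le \|\hat\alpha = \hat\beta\|, \]

so that it suffices to show that $\alpha \ne \beta$ implies $\|\hat\alpha = \hat\beta\| = 0$.

In fact, if, say, $\gamma \in \alpha$ but $\gamma \notin \beta$, then the formula (5) in §4 for $\|\hat\alpha = \hat\beta\|$
has a zero term, namely $\|\hat\gamma \in \hat\alpha\|' \vee \|\hat\gamma \in \hat\beta\|$. (To check that $\|\hat\gamma \in \hat\beta\| = 0$ if
$\gamma \notin \beta$ we have to know that $\|\hat\gamma = \hat\delta\| = 0$ if $\gamma \ne \delta$, but we only have to
know this for $\gamma$ and $\delta$ of lower rank than $\alpha$ and $\beta$, so that the detailed
proof uses induction on the rank.) □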


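8.12. VERIFICATION OF 8.10. We must show that

\[ \|\exists \text{ a function from } \hat\omega_1 \text{ onto } \mathcal{P}(\hat\omega_0)\| = 0, \]

that is, that

\[ \|\exists g (g \text{ is a function} \wedge \forall z (z \subset \hat\omega_0 \Rightarrow \exists y (y \in \hat\omega_1 \wedge \langle y, z \rangle \in g)))\| = 0. \]

Suppose that for some $G \in V^B$ we have

\[ 0 \ne a = \|G \text{ is a function}\| \wedge \Bigl( \bigwedge_Z \cdots \Bigr). \]

For every $\alpha < \omega_2$ we consider the term corresponding to $Z = Z(\alpha)$ (see the
definition in 8.2 and 8.5), and we use the fact that $\|Z(\alpha) \subset \hat\omega_0\| = 1$
(see 8.2). We obtain:

\[ 0 \ne a \le \|G \text{ is a function}\| \wedge \Bigl( \bigvee_Y \|Y \in \hat\omega_1\| \wedge \|\langle Y, Z(\alpha) \rangle^B \in G\| \Bigr). \tag{5} \]

By (1), we have

\[
\begin{aligned}
\|Y \in \hat\omega_1\| \wedge \|\langle Y, Z(\alpha) \rangle^B \in G\|
&= \bigvee_{\beta<\omega_1} \|Y = \hat\beta\| \wedge \|\langle Y, Z(\alpha) \rangle^B \in G\| \\
&= \bigvee_{\beta<\omega_1} \|Y = \hat\beta\| \wedge \|\langle \hat\beta, Z(\alpha) \rangle^B \in G\|.
\end{aligned}
\]

Summing first over $Y$, we rewrite (5) in the form

\[ 0 \ne a \le \|G \text{ is a function}\| \wedge \bigvee_{\beta<\omega_1} \|\langle \hat\beta, Z(\alpha) \rangle^B \in G\|. \]

Hence, for every $\alpha < \omega_2$ there is a $\beta(\alpha) < \omega_1$ such that

\[ 0 \ne a_\alpha = \|G \text{ is a function}\| \wedge \|\langle \widehat{\beta(\alpha)}, Z(\alpha) \rangle^B \in G\|. \]

Then there exists a $\beta_0 < \omega_1$ and a subset $I \subset \omega_2$ of cardinality $\omega_2$ such that

\[ 0 \ne a_\alpha = \|G \text{ is a function}\| \wedge \|\langle \hat\beta_0, Z(\alpha) \rangle^B \in G\|, \quad \text{for all } \alpha \in I. \]

As in 8.11, we obtain a contradiction to the countable chain condition if
we show that $a_\alpha \wedge a_\beta = 0$ for $\alpha \ne \beta$. But this follows from

\[ a_\alpha \wedge a_\beta \le \|Z(\alpha) = Z(\beta)\| = 0 \]

by Lemma 8.5. □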


9 Forcing 


9.1. By choosing the Boolean algebra $B$ in various ways, one can use the
corresponding models $V^B$ to show that many different assertions $P$ are
consistent with the Zermelo-Fraenkel axioms. But each choice of $B$ for a
given $P$ such that $\|P\| = 1$ in $V^B$ presents a separate problem.

There is another interpretation of this method which is closer to Cohen’s 
original idea. From this point of view we start not with a universe V and a 
Boolean algebra B, but with an (often countable) transitive model M and 
an ordered set C of “forcing conditions.” It is usually more obvious how to 
choose a suitable C than how to choose a suitable B for proving that a 
given proposition P is consistent. One might say that B embodies the 
“physical meaning” of the problem, while C expresses its “logical mean- 
ing.” Anyway, it is not difficult to go from one version to the other, and in 
either case it takes about the same amount of work to verify the “truth” of 
the axioms. 

In this section we discuss the second version, using forcing, with most of 
the proofs omitted. The details can be found in Cohen’s original article, 
and also in Jech’s book Lectures in Set Theory with Particular Emphasis on 
the Method of Forcing, Springer-Verlag Lecture Notes in Mathematics 217, 
1971, and in J. R. Shoenfield’s article, “Unramified forcing,” Proc. Symp. 
in Pure Math., vol. 13, 1, 357-381 (American Math. Soc., Providence, 
1972). 




9.2. Before introducing the general concept of forcing, we consider a 
special case which arises in a typical problem. 

Let $X$ and $Y$ be two sets, for example $\mathcal{P}(\omega_0)$ and $\omega_2$. We consider the
proposition $P$: “card $X \ge$ card $Y$,” which in this special case is the
negation of the CH. One possible approach to constructing a model (in the
usual rather than Boolean sense) of $L_1\mathrm{Set}$ in which $P$ is true, is as follows.

We take our original countable transitive model $M$ of set theory (i.e., of
the special axioms of $L_1\mathrm{Set}$), which was shown to exist in §7 of Chapter I.
Let $X_M$ and $Y_M$ be the “representatives” of $X$ and $Y$ in $M$. (This means
that if, say, $X$ is defined by the formula $\exists! x\, P(x)$, then $X_M = x^\xi$, where $\xi$
is a point of the interpretation class for which $|P(x)|_M(\xi) = 1$; see §7 of
Chapter II.) We assume that $X_M$ is infinite and $Y_M$ is nonempty. Then
“from an external point of view” $X_M$ is countable and $Y_M$ is at most
countable, so there automatically exists a function $F$ which maps $X_M$ onto
all of $Y_M$. A natural idea would be to add (the graph of) $F$ to $M$, i.e., to
consider the least countable model $N$ of the axioms which contains $M$ and
$F$. Then $N$ has a map from $X_M$ onto $Y_M$, but it is very likely that $X_N \ne X_M$
and $Y_N \ne Y_M$. What we need in $N$ is a map from $X_N$ onto $Y_N$.

As we have shown when discussing Skolem’s paradox in Chapter II, at
least for certain pairs (such as $X = \omega_0$, $Y = \mathcal{P}(\omega_0)$), we cannot obtain a
map from $X$ onto $Y$ in this way. In those cases when we can construct such
a map, we must choose F very carefully. Cohen’s idea was that F, rather 
than being chosen so as to satisfy some conditions, should be chosen so as 
to avoid reflecting any specific properties of M, i.e., F should be “generic.” 
We shall formulate this more precisely. 

It turns out to be important to start not by choosing $F$ directly, but by
choosing the set

\[ G = \{\text{restrictions of } F \text{ to finite subsets of } X_M\}. \]

Clearly, $F$ is uniquely determined from $G$: $F = \bigcup_{g \in G} g$ (recall that a
function is the same as its graph). Hence $F$ is contained in any model
which contains $G$. But now we must give an axiomatic characterization of
the suitable $G$ without using $F$ explicitly. Here are the properties which $G$
must satisfy:


9.3.

(a) $G \subset C$, where $C$ is the set of maps from finite subsets of $X_M$ to $Y_M$.

It is important that $C \in M$, because the formula in $L_1\mathrm{Set}$ which defines
$C$ is $(M, V)$-absolute. We need this remark in order to motivate the general
definitions later.

(b) $\emptyset \in G$; if $p \in G$ and $q \in C$, where $q \subset p$, then $q \in G$; for any
$p_1, p_2 \in G$ there is a $p \in G$ such that $p \supset p_1 \cup p_2$.

Suppose we have chosen such a set $G$ of maps from finite subsets of $X_M$
to $Y_M$. Then $\bigcup_{g \in G} g$ is also a map from some subset of $X_M$ to $Y_M$. In
order for this map to be defined on all of $X_M$ and to be surjective, it is
necessary and sufficient for the following additional conditions to hold:

\[
\begin{aligned}
&\forall Z \in X_M, \quad G \cap \{p \in C \mid p \text{ is defined at } Z\} \ne \emptyset, \\
&\forall Z \in Y_M, \quad G \cap \{q \in C \mid q \text{ takes the value } Z\} \ne \emptyset.
\end{aligned}
\]

We call a subset $D \subset C$ dense in $C$ if for all $p \in C$ there is a $q \in D$ with
$p \subset q$. The set of maps $p$ defined at $Z$ and the set of maps $q$ taking the
value $Z$ are dense, and, moreover, are elements of $M$ by the same
consideration of $(M, V)$-absoluteness. Hence the two requirements at the
end of the last paragraph are included in the last condition, that $G$ be
generic:

(c) $G \cap D \ne \emptyset$ for all dense subsets $D \subset C$ which are elements of $M$.

Although it is not yet evident, it is precisely the condition that $G$ be
generic which ensures that the properties of the sets $X_M$ and $Y_M$ will be
preserved as much as possible after we add $G$ to the model.

We now define the general concept of “forcing conditions.” 


9.4. Forcing conditions. These are the elements in any partially ordered set
$(C, \le)$ which has a maximal element 1. Usually $C$ and $\le$ lie in the
original model $M$.

A set $G$ is called generic over $M$ (relative to $C$) if the following
conditions hold:

(a) $G \subset C$;

(b) $1 \in G$; if $p \in G$ and $q \in C$, where $q \ge p$, then $q \in G$; for any $p_1, p_2 \in G$
there is a $p \in G$ such that $p \le p_1$ and $p \le p_2$;

(c) $G \cap D \ne \emptyset$ for all dense subsets $D \subset C$ with $D \in M$ ($D$ is dense if, for
all $p \in C$, there is a $q \in D$ with $q \le p$).

If the reader compares this definition with the special case in 9.3, he or
she will notice that we have replaced $\subset$ by $\ge$ and $\emptyset$ by 1. This is in
keeping with Cohen’s original point of view, according to which $p \ge q$ if,
when $p$ is considered as a “condition” imposed, say, on $F$, more $F$’s satisfy
$p$ than $q$. (Each $p$ fixes the restriction of $F$ to some finite subset of $X_M$.)


9.5. The existence of generic sets. Let $M$ and $C$ be fixed. If $M \cap \mathcal{P}(C)$ is
countable, then for every $p \in C$ there exists a generic set $G$ containing $p$.

In fact, we index the elements of $M \cap \mathcal{P}(C)$ as $X_1, X_2, X_3, \ldots$, and
then set $p_1 = p$ and

\[
p_{n+1} =
\begin{cases}
p_n, & \text{if } q \not\le p_n \text{ for all } q \in X_n; \\
\text{any } q \in X_n \text{ such that } q \le p_n, & \text{otherwise.}
\end{cases}
\]

Finally, we set $G = \{q \in C \mid \exists n\, (p_n \le q)\}$.

Conditions (a) and (b) for $G$ to be generic are trivial to verify. Condition
(c) follows because, if $D \in M$ and $D$ is dense, then there exist $n$ and $q$ for
which $D = X_n$, $q \in X_n$, and $q \le p_n$, so that $p_{n+1} \in D \cap G$.
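
The recursion for $p_{n+1}$ can be tried out on finitely many steps. The sketch below is a minimal illustration under assumptions that are not in the text: conditions are Python dicts standing for finite partial maps from the natural numbers, $q \le p$ means that $q$ extends $p$, and each subset $X_n$ is represented by a hypothetical "witness" function that returns some $q \le p$ in $X_n$ or `None` if there is none. Only a finite initial segment of the chain is computed, so the result approximates a generic set rather than producing one.

```python
def defined_at(x):
    """Witness for the dense set {p : x in dom p}."""
    def extend(p):
        return p if x in p else {**p, x: 0}
    return extend

def takes_value(y):
    """Witness for the dense set {p : y in range p}."""
    def extend(p):
        if y in p.values():
            return p
        # Some n in 0..len(p) is not yet in the domain of p.
        fresh = next(n for n in range(len(p) + 1) if n not in p)
        return {**p, fresh: y}
    return extend

def generic_chain(p, witnesses):
    """p_1 = p; p_{n+1} is a q <= p_n in X_n if one exists, else p_n."""
    chain = [p]
    for extend in witnesses:
        q = extend(chain[-1])
        chain.append(chain[-1] if q is None else q)
    return chain

def in_G(q, chain):
    """q is in G iff p_n <= q for some n, i.e. some p_n extends q."""
    return any(q.items() <= pn.items() for pn in chain)

witnesses = [defined_at(x) for x in range(3)] + [takes_value(y) for y in (0, 1)]
chain = generic_chain({}, witnesses)
```

Here the last element of `chain` is defined at 0, 1, 2 and takes both values 0 and 1, which mirrors the two displayed requirements of 9.3 for the finitely many dense sets considered.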
9.6. The connection with Boolean models. As mentioned before, we have
considerable freedom in our choice of the set $C$ of forcing conditions and
the generic subset $G \subset C$. Exactly how one “forces” a given proposition $P$
was explained briefly in 9.2. We now show how to construct an axiom
model $M[G]$ which contains $M$ and $G$, once $C$ and $G$ have already been
chosen.

The article by Shoenfield gives a direct construction, but we shall make
use of an analogy with $V^B$, as in Jech’s presentation. In this approach
$M[G]$ is constructed in three basic steps:


(a) Corresponding to the set $C$ we construct a canonical complete Boolean
algebra $B$.

(b) We construct a Boolean universe $M^B$ over $B$ which is “relativized” by
means of $M$.

(c) We construct a canonical maximal ideal $I_G \subset B$ determined by $G$ and
the “fibre” of the universe $M^B$ over the quotient algebra $B/I_G = \{0, 1\}$.
It is this fibre which will be the model $M[G]$.


We now discuss these steps separately and in more detail.


9.7. Ordered sets and Boolean algebras. Every Boolean algebra $B$ has a
canonical partial ordering: $a \le b$ if $a \wedge b = a$. All elements of the structure
of $B$ are uniquely determined by this partial ordering. The induced
ordering on $B - \{0\}$ is separable. By definition, this means that, if $a, b \ne 0$
and $a \not\le b$, then there exists $c \le a$, $c \ne 0$, such that there is no $d \ne 0$ for
which $d \le b$ and $d \le c$. (It suffices to take $c = a \wedge b'$.) Such $b$ and $c$ are
called disjoint.

Now let $C$ be a fixed partially ordered set. We consider the class of
(nonstrictly) order-preserving maps of $C$ into different complete Boolean
algebras $B$ such that 0 is not contained in the image.


9.8. Proposition. In this class of maps there exists a unique universal map
$e: C \to B$ with the following properties:


(a) $e(C)$ is the maximal separable ordered quotient set of $C$ such that
$c_1, c_2 \in C$ are disjoint $\Leftrightarrow$ $e(c_1), e(c_2) \in B$ are disjoint;
(b) $e(C)$ is dense in $B - \{0\}$.


$B$ can be realized as the algebra of regular open sets in the space $C$ with
the topology defined by the basis $U_c = \{x \in C \mid x \le c\}$, $c \in C$.
Now we can indicate how $I_G$ is constructed from the generic subset
$G \subset C$:

\[
\begin{aligned}
G_1 &= \{b \in B \mid \exists p \in G,\ e(p) \le b\}, \\
I_G &= B \setminus G_1.
\end{aligned}
\]

It is not hard to prove that $I_G$ is a maximal ideal in $B$, i.e., the kernel of a
Boolean homomorphism $B \to \{0, 1\}$. The set $G_1$ is precisely the preimage of
1 under this homomorphism. Since $G$ is generic in $C$, we have the
following property of $G_1$: for any subset $A \subset B$ such that $\bigvee_{a \in A} a = 1$ and
$a_1 \wedge a_2 = 0$ whenever $a_1 \ne a_2 \in A$, there exists a unique element $a \in A \cap G_1$.
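
For a small finite poset the realization of $B$ by regular open sets can be computed by brute force. The following sketch is only an illustration: the poset of binary strings of length at most 2 (with $p \le q$ when $p$ extends $q$), the particular set `G`, and all function names are assumptions chosen for the example. In a finite poset the requirement $D \in M$ plays no role, so `G` here is simply a maximal branch.

```python
from itertools import combinations

# Hypothetical toy poset C: binary strings of length <= 2, with p <= q
# when p extends q (longer strings are stronger conditions).
C = ["", "0", "1", "00", "01", "10", "11"]

def below(c):
    """Basic open set U_c = {x in C | x <= c}."""
    return frozenset(x for x in C if x.startswith(c))

def prime(a):
    """a' = complement of the closure of a; x lies in the closure of a
    exactly when its smallest neighbourhood U_x meets a."""
    return frozenset(x for x in C if not (below(x) & a))

def reg(a):
    """a'' = (a')'."""
    return prime(prime(a))

subsets = [frozenset(s) for r in range(len(C) + 1) for s in combinations(C, r)]
B = [a for a in subsets if reg(a) == a]   # the regular open sets

def join(a, b):
    return reg(a | b)

def e(c):
    return reg(below(c))

G = ["", "0", "01"]                        # a branch of the tree
G1 = [b for b in B if any(e(p) <= b for p in G)]
atoms = [e(c) for c in ("00", "01", "10", "11")]
```

In this example `B` has 16 elements, `G1` contains exactly half of them, and exactly one of the four pairwise disjoint `atoms` (whose join is the whole space) lies in `G1`, in line with the property of $G_1$ stated above.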


9.9. The universe $M^B$. This universe is constructed from $M$ and $B$ in
exactly the same way as $V^B$ was constructed from $V$ and $B$, with one
essential difference: all constructions are relativized with respect to $M$. This
means that, instead of $B$, we take the algebra $B_M$ which “represents” $B$ in
$M$ (see 9.2); only ordinals $\alpha \in M$ are used in the construction of $M^B$, and
so on. A rigorous presentation of these constructions would require much
more formalization using the expressive means in $L_1\mathrm{Set}$ than seems desirable
in this section. In such a presentation both the general plan and the
details of the work would remain essentially the same as before.

The basic result of these constructions is that to every closed formula $P$
in $L_1\mathrm{Set}$ with constants in $M$ corresponds a Boolean truth value $\|P\| \in B_M$.
Here the value 1 corresponds to the axioms, and deductions preserve
“truth.”

The next step cuts down the size of $M^B$, again giving a transitive
standard submodel.


9.10. Construction of M[G]. For brevity, we shall write $B$ instead of $B_M$,
and so on. The construction essentially consists in going from “random”
sets $X, Y \in M^B$ to “determined” sets $i(X), i(Y)$, where we say that $i(X) \in i(Y)$ if the
truth value $\|X \in Y\|$ goes to 1 under the homomorphism $B \to B/I_G = \{0, 1\}$,
i.e., if $\|X \in Y\| \in G_1$ (see 9.8). More precisely, we inductively
define

\[ i(\emptyset) = \emptyset, \qquad i(Y) = \{\, i(X) \mid \|X \in Y\| \in G_1 \,\}, \]

and let $M[G]$ denote the image of the map $i: M^B \to V$. This notation is
justified by the following result. Suppose that $C$ and $\le$ belong to $M$ and
that the subset $G \subset C$ is generic.


9.11. Proposition. $M[G]$ is a model for the Zermelo-Fraenkel axioms which
contains $M$ and $G$. If $M$ is countable, then $M[G]$ is the least such model.


$M[G]$ contains $M$ for the following reason. If we let $X \mapsto \hat X$ denote the
map $M \to M^B$ which is constructed as in 8.1, then it is easy to show that

\[ i(\hat X) = X. \]

$M[G]$ contains $G$ because $G = i(G')$, where $G'$ is the object in $M^B$ which
collects all the $\hat b$, $b \in B$, with probabilities $b$.

$M[G]$ is an axiom model basically because $M^B$ is a Boolean axiom
model. However, here we use in an essential way the assumption that $G$ is
generic. (Shoenfield verifies this result directly, without using $M^B$.)


9.12. EXAMPLE. We return to the assertion “card $\mathcal{P}(\omega_0) \ge \omega_2$” in 9.2. By the
above discussion, to prove that it is consistent with the axioms we choose a
countable model $M$ and then set:

\[
\begin{aligned}
C &= \{\text{maps of finite subsets of } \mathcal{P}(\omega_0) \text{ to } \omega_2\}, \\
G \subset C &= \text{a generic subset of } C.
\end{aligned}
\]

If we consider a map from a subset of $\mathcal{P}(\omega_0)$ to $\omega_2$ as a function from
$\omega_0 \times \omega_2$ to $\{0, 1\}$, and if, instead of “relative” constructions in $M$, we
consider “absolute” constructions in $V$, then the Boolean algebra $B$ that
we obtain from $C$ turns out to be the same algebra which was constructed
in 8.3 and 8.4. This explains the appearance of $B$. The ideal $I_G$ did not play
any role in §8 because we were not trying to construct a standard model.


9.13. We conclude with a very general theorem of Easton, which shows
how little we understand the behavior of the function $2^k$ ($k$ a cardinal).

Let $\alpha$ be a limit ordinal. Its cofinality $\operatorname{cf}(\alpha)$ is the least ordinal $\beta$ such
that $\alpha$ is the union of $\beta$ ordinals less than $\alpha$. An infinite cardinal $k$ is called
regular if $\operatorname{cf}(k) = k$ and is called singular if $\operatorname{cf}(k) < k$. König (1905)
proved that $\operatorname{cf}(2^k) > k$.


9.14. Theorem (Easton, 1965). Let $F$ be any (nonstrictly) monotonic function
on a subclass of the regular cardinals which takes values in the class of
cardinals and which satisfies: $\operatorname{cf}(\aleph_{F(\alpha)}) > \aleph_\alpha$. Then the assertion
“$\forall$ regular $\aleph_\alpha \in \operatorname{dom} F$, $2^{\aleph_\alpha} = \aleph_{F(\alpha)}$” does not contradict the Zermelo-Fraenkel
axioms.


If the domain of $F$ is a set, Easton’s theorem can be obtained using a
model of the form $M[G]$, where $M$ is a model in which the generalized
continuum hypothesis holds (Gödel proved that such an $M$ exists; see the
next chapter). If the domain of $F$ is a class (for example, the class of all
regular cardinals), the concept of forcing must be generalized to the case
when $C$ is a class.

The question of the behavior of $2^k$ for singular $k$ has not yet been
“solved” in this weak sense.

CHAPTER IV


The continuum problem and
constructible sets


1 Gödel’s constructible universe


1.1. In this section we introduce the subclass $L \subset V$, “Gödel’s constructible
universe,” and establish its fundamental properties. Perhaps the
shortest description of $L$ is that it is the smallest transitive model of the
axioms of $L_1\mathrm{Set}$ which contains all the ordinals. But the working definition
of $L$, from which the name “constructible universe” is derived, is rather
different.

We consider the following operations $F_1, \ldots, F_8$ on sets:

\[
\begin{aligned}
F_1(X, Y) &= \{X, Y\}, \\
F_2(X, Y) &= X \setminus Y, \\
F_3(X, Y) &= X \times Y, \\
F_4(X) &= \{U \mid \exists W (\langle U, W \rangle \in X)\} = \operatorname{dom} X, \\
F_5(X) &= \{\langle U, W \rangle \mid U, W \in X;\ U \in W\}, \\
F_6(X) &= \{\langle U_1, U_2, U_3 \rangle \mid \langle U_2, U_3, U_1 \rangle \in X\}, \\
F_7(X) &= \{\langle U_1, U_2, U_3 \rangle \mid \langle U_3, U_2, U_1 \rangle \in X\}, \\
F_8(X) &= \{\langle U_1, U_2, U_3 \rangle \mid \langle U_1, U_3, U_2 \rangle \in X\}.
\end{aligned}
\]
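
The first five operations are easy to experiment with on hereditarily finite sets. The sketch below is an illustration only, under the assumption that sets are modelled by Python `frozenset` values and that ordered pairs are Kuratowski pairs $\{\{U\}, \{U, W\}\}$; the helper names `pair` and `unpair` are hypothetical. The operations $F_6$, $F_7$, $F_8$ are left out because they depend on a convention for ordered triples that is not fixed in this passage.

```python
def pair(u, w):
    """Kuratowski ordered pair <u, w> = {{u}, {u, w}}."""
    return frozenset({frozenset({u}), frozenset({u, w})})

def unpair(z):
    """Return (u, w) if z is a Kuratowski pair of sets, else None."""
    if not isinstance(z, frozenset) or not all(isinstance(s, frozenset) for s in z):
        return None
    sizes = sorted(len(s) for s in z)
    if sizes == [1]:                      # <u, u> = {{u}}
        (s,) = z
        (u,) = s
        return (u, u)
    if sizes == [1, 2]:
        s1, s2 = sorted(z, key=len)
        (u,) = s1
        if u not in s2:
            return None
        (w,) = s2 - s1
        return (u, w)
    return None

def F1(x, y): return frozenset({x, y})
def F2(x, y): return x - y
def F3(x, y): return frozenset(pair(u, w) for u in x for w in y)
def F4(x):    return frozenset(p[0] for p in map(unpair, x) if p)
def F5(x):    return frozenset(pair(u, w) for u in x for w in x if u in w)

# The first von Neumann ordinals as test data.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})
```

For instance, `F4(F3(two, one))` recovers `two` as the domain of the product, and `F5(three)` consists of the three pairs that encode the membership relation on `three`.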


We say that a set (or class) $Y$ is closed with respect to an operation $F$ of
degree $r$ if we have $F(Z_1, \ldots, Z_r) \in Y$ for all $Z_1, \ldots, Z_r \in Y$ such that
$F(Z_1, \ldots, Z_r)$ is defined. For every $X \in V$ we let $\mathcal{S}(X)$ denote the
smallest set $Y \supset X$ which is closed with respect to the operations
$F_1, \ldots, F_8$. It will later be shown (subsection 1.4) that $\mathcal{S}(X)$ actually is a
set. The following construction is analogous to the definition of $V$.

1.2. Definition.

\[
\begin{aligned}
L_0 &= \emptyset; \\
L_{\alpha+1} &= \mathcal{P}(L_\alpha) \cap \mathcal{S}(L_\alpha \cup \{L_\alpha\}); \\
L_\alpha &= \bigcup_{\beta<\alpha} L_\beta, \quad \text{if } \alpha \text{ is a limit ordinal;} \\
L &= \bigcup_\alpha L_\alpha.
\end{aligned}
\]

The elements of $L$ are called constructible sets.


The operations $F_1, \ldots, F_8$ and simple combinations of them, together
with the transfinite recursion in the definition of $L$, exhaust the arsenal of
primitive set theoretic constructions used in mathematics. This can be seen
by looking at Bourbaki’s “compendium of the results of set theory,” upon
which all subsequent material in their voluminous treatise on the foundations
of mathematics is based. The only way we could possibly (but not
necessarily) leave $L$ would be to apply the axiom of choice. This could
happen provided that $L$ is strictly less than $V$; but, as mentioned before,
this question is undecidable in the Zermelo-Fraenkel axiom system (see
also 5.16 below). Gödel was of the opinion that $L$ does not exhaust $V$, as
are most specialists who accept the semantics of $L_1\mathrm{Set}$.

Of course, the constructibility of the elements of $L$ should not be
understood in a finitistic sense. The sets we construct at the $(\alpha + 1)$th stage
are only the subsets of $L_\alpha$ which are obtained from the elements of the sets
$L_\alpha$ and $\{L_\alpha\}$ using the explicit constructions $F_i$. But when we consider all
the ordinals indexing the stages, we see that $L$ is hopelessly infinite.
Nevertheless, in many respects the construction of $L$ is simpler than that of
$V$, and $L$ seems to provide a convenient framework for mathematics.

We now list some properties of $L$ which follow easily from the definitions.
The specific nature of the operations $F_i$ plays a very secondary role
in these properties.


1.3. $L_n = V_n$ for all $n \le \omega_0$. This is true for $L_0$. Suppose it is true for $L_n$. It is
clear from the definition that $L_n \in L_{n+1}$ and $\{X\} \in L_{n+1}$ for all $X \in L_n$.
Moreover, any subset of $L_n$ can be represented as a finite difference
$(\cdots((L_n \setminus \{X_1\}) \setminus \{X_2\}) \setminus \cdots) \setminus \{X_k\}$, where the $X_i \in L_n$ are the elements
not in the given subset.
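
The finite levels are small enough to enumerate directly. The sketch below is an illustration under the assumption that sets are modelled by Python `frozenset` values; it builds $V_n$ by iterating the power set and rebuilds an arbitrary subset of a level by removing one singleton at a time, which is the finite-difference representation used in 1.3. The function names are hypothetical.

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))

def V(n):
    """The level V_n for a finite n, starting from V_0 = empty set."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def as_finite_difference(level, subset):
    """Rebuild `subset` from `level` by repeatedly applying the set
    difference F_2 with the singletons {X_i} of the missing elements."""
    result = level
    for x in level - subset:
        result = result - frozenset({x})
    return result

sizes = [len(V(n)) for n in range(5)]
```

The list `sizes` records the cardinalities of $V_0, \ldots, V_4$, and every element of `V(4)` (that is, every subset of $V_3$) is recovered by `as_finite_difference(V(3), s)`.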


1.4. card $L_\alpha$ = card $\alpha$ for all infinite ordinals $\alpha$. In fact, for $X \in V$ let

\[ \Phi(X) = X \cup \bigcup_{i=1}^{3} F_i''(X \times X) \cup \bigcup_{j=4}^{8} F_j''(X), \]

where $F''(X) = \{F(Y) \mid Y \in X\}$ is the image of $F$ restricted to the elements
of $X$. Then $\mathcal{S}(X) = \bigcup_{n=0}^{\infty} \Phi^n(X)$. It is hence clear that card $\mathcal{S}(X)$ = card $X$
if $X$ is infinite. We now prove the assertion 1.4 by induction on $\alpha$.
Obviously card $L_\alpha \ge$ card $\alpha$. Suppose that $\alpha$ is the least infinite ordinal for
which card $L_\alpha >$ card $\alpha$. By 1.3, we have $\alpha > \omega_0$. $\alpha$ cannot be a limit
ordinal, or we would have card $L_\alpha \le \sum_{\beta<\alpha}$ card $\beta$ = card $\alpha$. But the case
$\alpha = \beta + 1$ is also impossible, since in that case
card $L_\alpha \le$ card $\mathcal{S}(L_\beta \cup \{L_\beta\})$ = card $(L_\beta \cup \{L_\beta\})$ = card $\beta$ = card $\alpha$. □


In particular, the result 1.4 shows that, beginning with $\omega_0 + 1$, the
inclusion $L_\alpha \subset V_\alpha$ becomes a strict inequality, since card $V_{\omega_0+1} = 2^{\omega_0}$. Of
course, this does not in principle exclude the possibility that $\forall \alpha\ \exists \beta > \alpha$,
$L_\beta \supset V_\alpha$; but it seems that there is no such $\beta$ even for $\alpha = \omega_0 + 1$.


1.5. $L$ is transitive: $Y \in X \in L_\alpha \Rightarrow Y \in L_\alpha$, i.e., $L_\alpha \subset L_{\alpha+1}$. See subsection
13 of the Appendix to Chapter II; the proof is no different for $L$.


1.6. $L$ is a big class: by definition, this means that for any $X \in V$ with
$X \subset L$ there exists a $Y \in L$ such that $X \subset Y$.

On $L$ we consider the function $\varphi(x)$ which is equal to the least $\alpha$ for
which $x \in L_\alpha$. Let $X \in V$, $X \subset L$. We consider the map $\varphi$ restricted to $X$.
By the replacement axiom, the values of $\varphi$ form some set $Y$. The elements
of $Y$ are ordinals. Let $\beta = \bigcup Y$. Then for each $x \in X$ we have $\beta \ge \varphi(x)$, so
that $X \subset L_\beta$.


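Effective numbering of L by ordinals.
We order pairs of ordinals $\langle \alpha, \beta \rangle$ by the relation

\[
\langle \alpha_1, \beta_1 \rangle < \langle \alpha_2, \beta_2 \rangle \iff
\begin{cases}
\text{either } \max(\alpha_1, \beta_1) < \max(\alpha_2, \beta_2), \\
\text{or else these maximums are equal and } \alpha_1 < \alpha_2, \\
\text{or else these maximums are equal and } \alpha_1 = \alpha_2 \text{ and } \beta_1 < \beta_2.
\end{cases}
\]

Further, we order triples $\langle i, \alpha, \beta \rangle$, where $i = 0, \ldots, 8$, by the relation

\[
\langle i_1, \alpha_1, \beta_1 \rangle < \langle i_2, \alpha_2, \beta_2 \rangle \iff
\begin{cases}
\text{either } \langle \alpha_1, \beta_1 \rangle < \langle \alpha_2, \beta_2 \rangle, \\
\text{or else } \langle \alpha_1, \beta_1 \rangle = \langle \alpha_2, \beta_2 \rangle \text{ and } i_1 < i_2.
\end{cases}
\]

We call these triples important.

1.7. Lemma. The class of important triples is well-ordered by the relation <.
In addition, the following assertions hold:

(a) The next triple after $\langle i, \alpha, \beta \rangle$ has the form:

\[
\begin{cases}
\langle i + 1, \alpha, \beta \rangle, & \text{if } i \le 7; \\
\langle 0, \alpha + 1, \beta \rangle, & \text{if } i = 8 \text{ and } \alpha + 1 < \beta; \\
\langle 0, \alpha + 1, 0 \rangle, & \text{if } i = 8 \text{ and } \alpha + 1 = \beta; \\
\langle 0, \alpha, \beta + 1 \rangle, & \text{if } i = 8 \text{ and } \alpha > \beta; \\
\langle 0, 0, \beta + 1 \rangle, & \text{if } i = 8 \text{ and } \alpha = \beta.
\end{cases}
\]

(b) Limit triples have the form:

$\langle 0, \alpha, \beta \rangle$, if $\alpha + 1 \le \beta$ and $\alpha$ is a limit ordinal:
this is the limit of $\langle i, \gamma, \beta \rangle$, $\gamma < \alpha$;

$\langle 0, \alpha, 0 \rangle$, if $\alpha$ is a limit ordinal: this is the limit of $\langle i, \gamma, \alpha \rangle$, $\gamma < \alpha$;

$\langle 0, \alpha, \beta \rangle$, if $\alpha \ge \beta$ and $\beta$ is a limit ordinal:
this is the limit of $\langle i, \alpha, \gamma \rangle$, $\gamma < \beta$;

$\langle 0, 0, \beta \rangle$, if $\beta$ is a limit ordinal: this is the limit of $\langle i, \alpha, \gamma \rangle$, $\alpha < \beta$, $\gamma < \beta$.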
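Proof. The proof follows immediately from the definitions. We shall
illustrate this by showing explicitly how to find the least triple in any
nonempty class $C$ of triples. We set

\[
\begin{aligned}
\gamma &= \min\{\max(\alpha, \beta) \mid \langle i, \alpha, \beta \rangle \in C\}; \\
C_1 &= \{\langle i, \alpha, \beta \rangle \in C \mid \max(\alpha, \beta) = \gamma\}.
\end{aligned}
\]

If $C_1$ does not contain any triples of the form $\langle i, \alpha, \gamma \rangle$ with $\alpha < \gamma$, then let $\beta_0$ be the
minimum of the third coordinates of triples in $C_1$, and let $i_0$ be the least $i$
such that $\langle i, \gamma, \beta_0 \rangle \in C_1$. Then $\langle i_0, \gamma, \beta_0 \rangle$ is the least triple in $C$. Otherwise,
let $C_2$ consist of the triples of the form $\langle i, \alpha, \gamma \rangle \in C_1$ with $\alpha < \gamma$, let $\alpha_0$ be the
minimum of the second coordinates in $C_2$, and let $i_0$ be the least $i$ such that
$\langle i, \alpha_0, \gamma \rangle \in C_2$. Then $\langle i_0, \alpha_0, \gamma \rangle$ is the least triple in $C$.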
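
For finite ordinals the order on important triples and the successor rule of Lemma 1.7(a) can be checked mechanically. The sketch below is only an illustration restricted to natural numbers in place of ordinals; the function names `key` and `successor` are hypothetical.

```python
def key(t):
    """Sort key realizing the order on important triples <i, a, b>:
    compare max(a, b), then a, then b, and finally i."""
    i, a, b = t
    return (max(a, b), a, b, i)

def successor(t):
    """Next important triple, following the five cases of Lemma 1.7(a)."""
    i, a, b = t
    if i <= 7:
        return (i + 1, a, b)
    if a + 1 < b:
        return (0, a + 1, b)
    if a + 1 == b:
        return (0, a + 1, 0)
    if a > b:
        return (0, a, b + 1)
    return (0, 0, b + 1)  # remaining case: a == b

# All triples with a, b <= 3 form an initial segment of the order.
triples = sorted(((i, a, b) for i in range(9) for a in range(4) for b in range(4)),
                 key=key)
consecutive_ok = all(successor(s) == t for s, t in zip(triples, triples[1:]))
```

Since the triples with $\max(\alpha, \beta) \le 3$ form an initial segment, comparing each sorted element with the `successor` of its predecessor tests every case of (a) on this segment.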


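The exact form of assertions (a) and (b) will be needed only in §5. The
lemma implies that there exists a unique order-preserving isomorphism

\[ K : \{\text{ordinals}\} \cong \{\text{important triples}\}. \]

Using this isomorphism, we recursively define a numbering mapping

\[ N : \{\text{ordinals}\} \to L. \]

Since we have $\alpha < \gamma$ and $\beta < \gamma$ if $\gamma > 0$, $i > 0$, and $K(\gamma) = \langle i, \alpha, \beta \rangle$, we
may set:

\[
N(\gamma) =
\begin{cases}
L_\alpha, & \text{for } i = 0; \\
F_i(N(\alpha), N(\beta)), & \text{for } i = 1, 2, 3; \\
F_i(N(\alpha)), & \text{for } i = 4, 5, 6, 7, 8.
\end{cases}
\]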


1.8. Lemma.
(a) The mapping N is correctly defined.
(b) The image of N coincides with all of L.


PROOF.

(a) To verify correctness, it suffices to show that {L_α} ⊂ L and that the class L is closed with respect to the operations F_i. In fact, then induction on γ shows that N(γ) ∈ L if N(α) ∈ L for all α < γ.

Let X, Y ∈ L_α. Since L_α is transitive (see 1.5), we easily find that F_1(X, Y), F_2(X, Y), and F_4(X) belong to P(L_α), and hence to L_{α+1}. For example,

    U ∈ F_4(X) ⇒ ∃W (⟨U, W⟩ ∈ L_α) ⇒ {U} ∈ L_α ⇒ U ∈ L_α.

Further, X × Y is a subset of the ordered pairs of elements in L_α. We showed that the unordered pairs lie in L_{α+1}, so that the ordered pairs lie in L_{α+2} and finally X × Y ∈ L_{α+3} and F_5(X) ∈ L_{α+4}. Analogously, elements of F_i(X) for i = 6, 7, 8 are ordered triples of elements in L_α, so that F_i(X) ∈ L_{α+6}.

(b) Let Z be the image of N. We show by induction on α that L_α ⊂ Z. If α is a limit ordinal and L_γ ⊂ Z for each γ < α, then also L_α = ∪_{γ<α} L_γ ⊂ Z. Suppose α = β + 1 and L_β ⊂ Z, and let X ∈ L_α. Then X ∈ Φ^n(L_β ∪ {L_β}) for some n, and we show X ∈ Z by induction on n.

(b_0) n = 0. Then either X ∈ L_β, so X ∈ Z by the induction hypothesis, or else X = L_β, in which case X = N(γ) for γ such that K(γ) = ⟨0, β, 0⟩.

(b_1) n > 0. Let X = F_i(Y, Z), i = 1, 2, 3; Y, Z ∈ Φ^{n−1}(L_β ∪ {L_β}). By the induction hypothesis Y = N(γ_1) and Z = N(γ_2) for some ordinals γ_1, γ_2. Therefore X = N(γ) where K(γ) = ⟨i, γ_1, γ_2⟩.

Let X = F_i(Y), i = 4, ..., 8; Y ∈ Φ^{n−1}(L_β ∪ {L_β}). The verification is analogous.

The lemma is proved. □


In §3 the numbering N will allow us to prove that a strong form of the 
axiom of choice is L-true. The fundamental step in the proof is to choose 
the element with the least N-number in each constructible set. 


2 Definability and absoluteness 


2.1. Let M ⊂ V be a non-empty class, and let P be a formula in L_1Set. As in §7 of Chapter II, we shall consider the truth values |P|_M(ξ) for ξ ∈ M, where we take the standard interpretation of L_1Set in V restricted to M. We then say that the formula P is M-true if |P|_M(ξ) = 1 for all ξ.

We shall also consider formulas “with constants in M,” where we assume that the language L_1Set has been extended so that its alphabet includes names for all the elements of M. We shall designate these elements by the same letters as in the metalanguage (X, Y, ... for sets; α, β, ... for ordinals, etc.), which we hope will not lead to confusion. We extend the definition of |P|_M(ξ) to formulas with constants in M in the obvious way: we take X^ξ = X for any constant X and any point ξ.


2.2. Definition. Let X_i ∈ M, i = 1, ..., n. Sets of the form

    {⟨y_1^ξ, ..., y_n^ξ⟩ | ξ ∈ M, y_i^ξ ∈ X_i for i = 1, ..., n; |P|_M(ξ) = 1} ⊂ X_1 × ··· × X_n

are called M-definable sets. Here P runs through all formulas with constants in M and free variables in the set {y_1, ..., y_n}.


If P(y_1, ..., y_n; Z_1, ..., Z_m) is such a formula (where the notation shows the constants and free variables) and if y_i^ξ = Y_i, we shall often write “P(Y_1, ..., Y_n; Z_1, ..., Z_m) is M-true” instead of |P|_M(ξ) = 1.

The next proposition, which, in particular, is applicable to L, is a basic 
instrument for proving many assertions about L. 


2.3. Proposition. Let M ⊂ V be a transitive big class (see 1.6) which is closed with respect to the operations F_1, ..., F_8. Then all M-definable sets are elements of M.


PROOF. The proof is by induction on the number of connectives and quantifiers in the defining formula P.

(a) P(y_1, ..., y_n; Z_1, ..., Z_m) is an atomic formula. It can have one of eight possible forms: the predicate can be either ∈ or =, and on each side of ∈ or = we can have either a constant or a variable. But all of these cases reduce to two: y_i ∈ y_j and y_i ∈ Z, if we are willing to make the formula a little more complicated. For example, since M is transitive, we have

    “y = Z” defines the same set as ∀z(z ∈ Z ⇔ z ∈ y),
    “Z ∈ y” defines the same set as ∃z(z = Z ∧ z ∈ y),

and so on. We therefore analyze these two basic cases.

(a_1) y_i ∈ Z. We have Z ∩ X_i = Z \ (Z \ X_i) ∈ M since Z and X_i ∈ M, and M is F_2-closed; and we have X_1 × ··· × X_{i−1} × (Z ∩ X_i) × X_{i+1} × ··· × X_n ∈ M, since M is F_3-closed. This last set is M-definable by the formula y_i ∈ Z, because M is transitive.

(a_2) y_i ∈ y_j. We use induction on n ≥ 3. Let

    Y = {⟨Y_1, ..., Y_n⟩ | Y_k ∈ X_k for k = 1, ..., n; Y_i ∈ Y_j}.
The case ⟨i, j⟩ = ⟨n − 1, n⟩. Let X_{n−1} ∪ X_n ⊂ X ∈ M. Then
y= 
x Fy (Fs(X) X(X,X + XK Xn) (Xn X Xn) K(X, + XX, -2))- 
The case ⟨i, j⟩ = ⟨n, n − 1⟩. Again let X_{n−1} ∪ X_n ⊂ X ∈ M. Then
Y= 
x Fy (F5(X) X (Xy XK Xp) (Xn XX) X (KK + K X72). 


The case n ∉ {i, j}. By the induction assumption, the set Y′ which is M-defined by the formula y_i ∈ y_j in X_1 × ··· × X_{n−1} lies in M. But Y = Y′ × X_n.

The case n − 1 ∉ {i, j}. Let Y′ be M-defined by the formula y_i ∈ y_j in
X,X +++ x X,_)%X,. Then Y= F,(¥' X X,_,). 
The case n = 2 reduces to the case n = 3 by taking the direct product with {∅} and projecting. The projection of X_1 × ··· × X_n onto X_1 is F_4 ∘ ··· ∘ F_4 (n − 1 times).

(b) Connectives. ∧ corresponds to intersection, and ¬ corresponds to taking the complement (relative to X_1 × ··· × X_n). M is closed with respect to these operations, and the other connectives can be expressed in terms of these two.

(c) Quantifiers. It suffices to verify ∃. This corresponds to projecting, because M is a big class. More precisely, let Y be M-defined by the formula ∃y_{n+1} P(y_1, ..., y_n, y_{n+1}) in X_1 × ··· × X_n. We have:

    ⟨Y_1, ..., Y_n⟩ ∈ Y ⇔ there exists a Y_{n+1} ∈ M such that P(Y_1, ..., Y_{n+1}) is M-true.

To each ⟨Y_1, ..., Y_n⟩ ∈ X_1 × ··· × X_n we associate the least ordinal α for which there exists Y_{n+1} ∈ M ∩ V_α such that P(Y_1, ..., Y_{n+1}) is M-true, if there is such a Y_{n+1}. This gives rise to a function on Y ⊂ X_1 × ··· × X_n. Let A be the set of its values, and let β = ∪A. Then X = M ∩ V_β is a set, and X ⊂ M. Since M is a big class, there exists X_{n+1} ∈ M such that X ⊂ X_{n+1}. By the induction assumption, the M-definable subset Y′ ⊂ X_1 × ··· × X_n × X_{n+1} consisting of those points ⟨Y_1, ..., Y_{n+1}⟩ for which P(Y_1, ..., Y_{n+1}) is M-true, belongs to M. But Y = F_4(Y′), and M is closed under F_4.

The proposition is proved. □
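
The correspondence used in steps (b) and (c) of the proof, namely conjunction as intersection, negation as relative complement, and the existential quantifier as projection, can be tried out on hereditarily finite sets. What follows is a sketch only: sets are modelled by Python frozensets, tuples stand in for ordered n-tuples, and the function names are ours.

```python
from itertools import product

def atomic_membership(domains, i, j):
    """Tuples in the product of the domains whose i-th entry is an element
    of the j-th entry (indices are 0-based)."""
    return {t for t in product(*domains) if t[i] in t[j]}

def conj(r, s):
    """Conjunction corresponds to intersection."""
    return r & s

def neg(domains, r):
    """Negation corresponds to the complement relative to the product."""
    return set(product(*domains)) - r

def exists_last(r):
    """Quantifying over the last variable corresponds to projecting it away."""
    return {t[:-1] for t in r}
```

For instance, with X = {0, 1, 2} (finite von Neumann ordinals), the formula ∃y_2(y_2 ∈ y_1) defines the nonempty elements of X, and its negation defines {0}.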


In order to be able to use Proposition 2.3, we need criteria for verifying M-truth. As remarked in §7 of Chapter II, the basic technical tool for this is the notion of absoluteness. A formula P is called M-absolute ((M, V)-absolute in the terminology of Chapter II), if |P|_M(ξ) = |P|_V(ξ) for all ξ ∈ M ⊂ V. The standard method of proving that a formula is M-true is to prove that it is V-true and M-absolute.

The following lemma provides us with a large class of M-absolute 
formulas. 


2.4. Lemma.
(a) Atomic formulas are M-absolute for all M.
(b) If the formulas P, P_1, and P_2 are M-absolute, then so are the formulas ¬P and P_1 * P_2 (where * is any connective).
(c) Suppose that the class M is transitive, and is closed with respect to an operation f of degree r. If the formula P is M-absolute, then the “restricted quantifier” formulas

    ∀x(x ∈ f(y_1, ..., y_r) ⇒ P),
    ∃x(x ∈ f(y_1, ..., y_r) ∧ P)

are also M-absolute.
PROOF. Part (c) is the only assertion that might not be completely obvious. Before proving it, we make one remark. The formula x ∈ f(y_1, ..., y_r) is written in a suitable extension of L_1Set, and may be assumed to be V-equivalent to some formula P(x, y_1, ..., y_r) in L_1Set (with constants in M) for which ∀y_1 ··· ∀y_r ∃!x P or a restricted version of this formula is deducible from the Zermelo-Fraenkel axioms. This P determines the operation f. We also allow the case r = 0; then f is simply a constant in M. We shall identify f with its standard interpretation, i.e., we shall denote terms by f(Y_1, ..., Y_r) ∈ M for Y_1, ..., Y_r ∈ M.

Now let ξ ∈ M, y_i^ξ = Y_i ∈ M, Q = ∃x(x ∈ f(y_1, ..., y_r) ∧ P), Y = f(Y_1, ..., Y_r) ∈ M. Then

    |Q|_M(ξ) = sup_{X ∈ M} (|X ∈ Y|_M ∧ |P|_M(ξ′)),

where the ξ′ ∈ M are variations of ξ along x such that x^{ξ′} = X. Since P is absolute, it follows that |P|_M(ξ′) = |P|_V(ξ′), and since M is transitive, it follows that, if X ∉ M, then |X ∈ Y|_M = |X ∈ Y|_V = 0. Hence, on the right we can write V everywhere in place of M and can let ξ′ run through all variations of ξ along x in V with x^{ξ′} = X. The resulting expression equals |Q|_V(ξ).

The quantifier ∀ can be handled analogously, or else can be reduced to ∃. The lemma is proved. □

We shall abbreviate the restricted quantifier formulas in 2.4 (c) as

    (∀x ∈ f(y_1, ..., y_r))P,   (∃x ∈ f(y_1, ..., y_r))P

respectively.

If all the quantifiers in a formula Q are restricted in this way, we say that Q is a Σ_0-formula.

As a first application of the results in 2.3 and 2.4, we prove the 
following fact. 


2.5. Proposition. All ordinals are constructible.


PROOF. Suppose that this is not the case, and that β is the least nonconstructible ordinal. All of the elements in β are contained in L_α for some α. Since L is transitive, it follows that all γ > β are nonconstructible. Hence,

    β = {x | (x is an ordinal ∧ x ∈ L_α) is V-true}.

If we show that “V-true” may be replaced by “L-true” here, we immediately have a contradiction, since then β ∈ L by Proposition 2.3.

To do this, it suffices to verify that the formula “x is an ordinal” is L-absolute. Using the regularity axiom, from which ¬(y ∈ y) is deducible, we can write this formula in the following Σ_0-form:

    (∀y ∈ x)(∀z ∈ y)(z ∈ x)
        ∧ (∀y_1 ∈ x)(∀y_2 ∈ x)(y_1 ∈ y_2 ∨ y_2 ∈ y_1 ∨ y_1 = y_2)

and then apply Lemma 2.4. □
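
The Σ_0-form of “x is an ordinal” only ever inspects elements of x and elements of those elements, which is the reason its truth value does not depend on the surrounding transitive class. A small sketch on hereditarily finite sets, assuming sets are modelled by Python frozensets (the names is_ordinal and von_neumann are ours):

```python
def is_ordinal(x):
    """Evaluate the bounded formula for "x is an ordinal" on a frozenset x."""
    # (for all y in x)(for all z in y)(z in x): x is transitive.
    transitive = all(z in x for y in x for z in y)
    # (for all y1, y2 in x)(y1 in y2 or y2 in y1 or y1 = y2).
    connected = all(y1 in y2 or y2 in y1 or y1 == y2 for y1 in x for y2 in x)
    return transitive and connected

def von_neumann(n):
    """The finite ordinal n = {0, 1, ..., n - 1} as a nested frozenset."""
    o = frozenset()
    for _ in range(n):
        o = o | {o}
    return o
```

Both quantifier blocks range over x itself, so the evaluation never needs to know which larger universe x lives in.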
3 The constructible universe as a model for set theory


3.1. Theorem. The Zermelo—Fraenkel axioms are L-true. 


PROOF. The general principle for verifying the axioms is to note that every set whose existence is stipulated in a given axiom can be represented as a set defined by a Σ_0-formula with constants in L. We only occasionally have to perform a direct verification that a subformula is L-absolute.

(a) Empty set. This axiom is equivalent to the Σ_0-formula ¬∃x(x ∈ ∅), which is V-true.

(b) Extensionality. This axiom can be represented in Σ_0-form. In addition, in subsection 4.8 of Chapter II we verified this axiom for any transitive class.

(c) Pairing. A direct computation of the L-truth function gives 1, since L 
is closed with respect to forming pairs. 

(d) Regularity. This follows by a direct computation using the transitiv- 
ity of L. 

(e) Union. Here it is somewhat more complicated to reduce the axiom to a Σ_0-formula. The axiom is written in the form

    ∀x ∃y ∀u(∃z(u ∈ z ∧ z ∈ x) ⇔ u ∈ y).

Let ξ ∈ L, let ξ′ be any variation of ξ along x, and let X = x^{ξ′} ∈ L. We must show that

    |∃y ∀u(∃z(u ∈ z ∧ z ∈ X) ⇔ u ∈ y)|_L(ξ′) = 1.

It suffices to find a Y ∈ L such that

    |∀u(∃z(u ∈ z ∧ z ∈ X) ⇔ u ∈ Y)|_L = 1,

i.e., such that for all U ∈ L

    |(∃z ∈ X)(U ∈ z)|_L = |U ∈ Y|_L.

We can clearly take Y = ∪_{Z∈X} Z if we show that Y is constructible. Since L is transitive, we know that all the elements of Y are constructible. Hence, there exists a constructible set Y′ such that Y′ ⊃ Y. Then Y can be represented as follows (where we replace V-truth by L-truth using Lemma 2.4):

    Y = {U | U ∈ Y′; (∃z ∈ X)(U ∈ z) is L-true}.

Now the required assertion follows by Proposition 2.3.
In what follows we shall usually omit explicit mention of the points ξ ∈ L.
(f) Power set axiom ∀x ∃y ∀z(z ⊂ x ⇔ z ∈ y). We fix X ∈ L, form the set Y = P(X) ∩ L of constructible subsets of X, and show that Y is constructible. In fact, let Y′ ⊃ Y, where Y′ is constructible. Then by Lemma 2.4

    Y = {Z | Z ∈ Y′; (Z ⊂ X) is L-true},

because Z ⊂ X has the Σ_0-form (∀z ∈ Z)(z ∈ X). Now a direct computation gives

    |∀z(z ⊂ X ⇔ z ∈ Y)|_L = 1.

(g) Infinity. This axiom is L-true because of the constructibility of the set {∅, {∅}, {{∅}}, ...}, which can be represented in the form

    {Y | Y ∈ L_{ω_0}; [Y = ∅ ∨ (∃y ∈ L_{ω_0})(Y = {y})] is L-true}.

(h) Replacement. Let z̄ = ⟨z_1, ..., z_n⟩. This axiom is written in the form

    ∀z̄ ∀u(∀x(x ∈ u ⇒ ∃!y P(x, y, z̄))
        ⇒ ∃w ∀y(y ∈ w ⇔ ∃x(x ∈ u ∧ P(x, y, z̄)))).

We fix Z_1, ..., Z_n ∈ L, Z̄ = ⟨Z_1, ..., Z_n⟩, and U ∈ L. It is sufficient to consider the case when the premise is L-true, i.e., when, for all X ∈ L,

    |X ∈ U ⇒ ∃!y P(X, y, Z̄)|_L = 1.

We must find a value W ∈ L of w for which the conclusion is L-true. We set W′ = a constructible set containing as elements all constructible Y for which

    (∃x ∈ U)P(x, Y, Z̄) is L-true.

This set exists because, since the premise of the axiom is L-true, it follows that each X ∈ U corresponds to at most one constructible Y. We then set

    W = {Y | Y ∈ W′; (∃x ∈ U)(P(x, Y, Z̄)) is L-true}.

This set is constructible by Proposition 2.3, and it follows from the way it is defined that

    |∀y(y ∈ W ⇔ ∃x(x ∈ U ∧ P(x, y, Z̄)))|_L = 1.


(i) Axiom of choice. The main intuitive point in the verification is the 
numbering N of the universe L that was constructed in 1.8. But the formal 
verification is much more complicated here than in the previous cases. A 
fair amount of work is needed to give a formalization of the construction 
in 1.7-1.8 which is sufficiently detailed to prove the following fact: 


3.2. Proposition. There exists a formula N(x, y) in L_1Set with two free variables such that
(a) For any X, Y ∈ V, the formula N(X, Y) is V-true if and only if X is an ordinal and Y = N(X).
(b) N(x, y) is L-absolute.

We shall postpone the proof until §5, and shall make use of this proposition to verify the axiom of choice. We divide this verification into two steps.


3.3. UNIVERSAL CHOICE FUNCTION. Let X ∈ L be a nonempty set. We construct the function Y which for every nonempty Z ∈ X chooses the element U in Z with the least N-number (see 1.8):

    Y = {⟨Z, U⟩ | Z ∈ X, U ∈ ∪_{X′∈X} X′; U ∈ Z ∧ ∃w(N(w, U) ∧ ∀z(z ∈ Z
        ⇒ (z = U ∨ ∀w′(N(w′, z) ⇒ w ∈ w′)))) is V-true}.

We want to prove that Y ∈ L. By Proposition 2.3, this holds if we can define Y by means of the L-truth of a formula. We are not allowed mechanically to replace V by L, since it is not immediately obvious from its external form that this formula is L-absolute. We proceed as follows: taking into account the constructibility of the ordinals, we take all ordinals which occur as the least N-numbers of the elements of the constructible set ∪_{X′∈X} X′ = ∪(X), and we find a constructible set W which contains these ordinals. Then we replace ∃w by ∃w ∈ W and ∀w′ by ∀w′ ∈ W in the formula. The set Y does not change, and now V-truth may be replaced by L-truth, as can be seen by using Proposition 3.2 and Lemma 2.4.
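
The idea of 3.3, choosing from each nonempty member of X the element with the least number, is easy to illustrate in a finite setting. This is a toy sketch only: the callable number is a hypothetical stand-in for the least N-number of an element, and it is assumed to take distinct values on distinct elements.

```python
def choice_function(X, number):
    """Map each nonempty Z in X to its element with the least number.

    `number` is a hypothetical stand-in for the least N-number of an element.
    """
    return {Z: min(Z, key=number) for Z in X if Z}
```

The empty set, if it belongs to X, is simply left out of the domain, in agreement with the requirement that only nonempty Z receive a chosen element.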


3.4. We now compute the L-truth value of the axiom of choice:

    ∀x(x ≠ ∅ ⇒ ∃y(y is a function ∧ dom y = x
        ∧ (∀z ∈ x)(z ≠ ∅ ⇒ y(z) ∈ z))).

It suffices to show that, if we take a nonempty X ∈ L and the constructible choice function Y ∈ L in 3.3, then

    |Y is a function|_L = |dom Y = X|_L = |(∀z ∈ X)(z ≠ ∅ ⇒ Y(z) ∈ z)|_L = 1.

The third formula here is V-true, and is written in Σ_0-form except for the subformula Y(z) ∈ z, which can be replaced by (∀u ∈ ∪^2(Y))(⟨z, u⟩ ∈ Y ⇒ u ∈ z). Thus, the third formula is L-absolute and hence L-true.

We verify that the first two formulas are absolute in §5. They are V-true by construction. This completes the proof of Theorem 3.1. □


We note that the same argument shows the following: all the axioms, with the possible exception of the axiom of choice, are M-true for any transitive big class M which is closed with respect to the operations F_1, ..., F_8.
4 The Generalized Continuum Hypothesis is L-true


4.1. We wish to show that the assertion “card P(ω_α) = ω_{α+1}” is L-true. A certain amount of caution is essential here, because cardinality is not an L-absolute notion. If Y is a constructible set, let card_L(Y) be the least ordinal β for which there exists in L a one-to-one onto function f: Y → β. Hence “card(Y) = card(Z)” is L-true iff card_L(Y) = card_L(Z). Note that although card_L(Y) ≥ card(Y), equality fails if there are one-to-one onto functions Y → β in V, but no such function lies in L. The cardinal ω_α in L is the αth ordinal β ≥ ω_0 such that card_L(β) = β. Thus ω_α in L may not coincide with the “real” ω_α, that is, with ω_α in V.

We shall show that for each ordinal β and each constructible X ⊂ β there is an ordinal γ with X ∈ L_γ and card_L(γ) = card_L(β). Hence P(β) ∩ L ⊂ L_{β*}, where β* is the least ordinal greater than β such that card_L(β*) ≠ card_L(β). The L-truth of the Generalized Continuum Hypothesis will then follow if we show the L-truth of “card(L_{β*}) = β*.”

Our proof exploits throughout a proposition that requires a good deal of work formalizing the construction of L within L_1Set.


4.2. Proposition. There exists a formula L(x, y) of L_1Set with two independent variables such that

(a) for any X and Y in V, L(X, Y) is V-true ⇔ Y is an ordinal and X ∈ L_Y;

(b) for any transitive model M ⊂ V of the axioms (without the axiom of choice), the formula L(x, y) is M-absolute. In particular, it is L-absolute.


We again postpone the proof until §5. 


4.3. Lemma. Let X ⊂ β be constructible. Then X ∈ L_γ for some ordinal γ such that card_L(γ) = card_L(β).


PROOF. In this deduction, in addition to Proposition 4.2 we use versions of Propositions 7.3 and 7.6 of Chapter II which apply to the constructible universe. They are formulated precisely and proved below, in subsections 4.5 and 4.6.

Suppose that X ⊂ β is constructible. Let δ be an ordinal such that X ∈ L_δ. We enlarge the alphabet of L_1Set by adding names δ̄ and X̄ for δ and X. Let ℰ be the set of formulas

    {axioms of L_1Set} ∪ {L(X̄, δ̄)}.

Let N_0 ⊂ L be the set β ∪ {X} ∪ {δ}. By Proposition 4.5 there is a constructible set N such that N_0 ⊂ N, all formulas in ℰ are (N, L)-absolute, and card_L(N) = card_L(β). Thus (N, ∈) is a model for the axioms and, by Proposition 4.2 (a), for L(X̄, δ̄). Now N might not be transitive, but then by Proposition 4.6 there is a transitive axiom model (M, ∈) and a constructible isomorphism f : (N, ∈) → (M, ∈). Hence L(X̄, δ̄) is M-true and card_L(M) = card_L(N). What are the interpretations of the constants X̄ and δ̄ in M?

Since the set β ⊂ N is transitive, it goes to itself under the isomorphism f; hence so does the set X ⊂ β. Let δ_M be the image of δ under f. Since by Proposition 4.2(b) the formula L(x, y) is M-absolute, and L(X̄, δ̄) is M-true, it follows that L(X, δ_M) is V-true, so that δ_M is an actual ordinal and X ∈ L_{δ_M}. Moreover, since δ_M ∈ M and M is transitive, δ_M ⊂ M; hence card_L(δ_M) ≤ card_L(M). Letting γ be the larger of δ_M and β, we have card_L(γ) = card_L(β) and X ∈ L_γ. The lemma is proved. □


4.4. DEDUCTION THAT THE GCH IS L-TRUE FROM THE LEMMA. Let β* be the smallest ordinal greater than β such that card_L(β*) ≠ card_L(β). Then Lemma 4.3 implies the V-truth of the formula

    ∀z(z ∈ L ⇒ (z ⊂ β ⇒ z ∈ L_{β*})).

Since “z ∈ L_{β*}” (i.e., the formula L(z, β*)) is L-absolute, it follows that

    ∀z(z ⊂ β ⇒ z ∈ L_{β*})

is L-true. Now if β is the cardinal ω_α in L then β* is the cardinal ω_{α+1} in L. Hence for each α we have shown the L-truth of

    P(ω_α) ⊂ L_{ω_{α+1}}.

We claim that the following formula is also L-true:

    card(L_{ω_{α+1}}) = ω_{α+1}.

Since “card(P(ω_α)) ≤ ω_{α+1}” is formally deducible in L_1Set from the preceding two formulas, and since all the axioms are L-true, this will show that the GCH is L-true.

Our claim is verified thus: In subsection 1.4 we proved that card(L_γ) = card(γ) for each ordinal γ. Indeed, that proof can be formalized in L_1Set, using the formula L(x, y) of Proposition 4.2. That is, the assertion “∀γ(card(L_γ) = card(γ))” is deducible from the axioms (see 5.17). Since the axioms are L-true, this assertion is then L-true. But since “card(ω_{α+1}) = ω_{α+1}” is trivially L-true, the claim follows. This completes the proof. □


4.5. Proposition. Let ℰ be a constructible countable set of L-true formulas in the language L_1Set, and let M_0 be a constructible set. Then there exists a constructible set M ⊃ M_0, card_L(M) ≤ card_L(M_0) + ω_0, such that all of the formulas in ℰ are (L, M)-absolute.


PROOF. The general scheme is the same as in subsection 7.3 of Chapter II, but some additional precautions are required. The main point is to prove that, if P(x, ȳ), ȳ = (y_1, ..., y_n), is a formula in ℰ, then there exists a constructible set M ⊃ M_0 with card_L(M) ≤ card_L(M_0) + ω_0 which can be constructed constructibly from P and has the property that ∃x(P(x, ȳ)) is (L, M)-absolute. After this we must verify constructible closure over all P ∈ ℰ.

We reproduce the construction in subsection 7.3 of Chapter II. We construct the set M_i by induction. Let Ȳ = ⟨Y_1, ..., Y_n⟩ ∈ M_i × ··· × M_i. We let M_i(Ȳ) denote the class {X | P(X, Y_1, ..., Y_n) is L-true}. We let M′_i(Ȳ) denote ∅ if M_i(Ȳ) is empty, and M_i(Ȳ) ∩ L_α for the least α for which this intersection is nonempty, otherwise. Since L(x, y) is absolute (see §5), it is not hard to see that the function M′_i, dom M′_i = M_i × ··· × M_i, is constructible. Because the constructible axiom of choice holds in L, we can obtain a constructible function F_i by choosing one element from each nonempty M′_i(Ȳ). Let N_i be the set of values of F_i. This set is constructible, since all of our constructions are absolute; and, if M_i is infinite, then card_L(N_i) ≤ card_L(M_i). We set M_{i+1} = M_i ∪ N_i and M = ∪ M_i. The set M has the required properties; obviously, card_L(M) + ω_0 = card_L(M_0) + ω_0 in L. The formal transition from {M_i} to M is realized by considering a function which “closes” M_0, as in subsection 5.11 below. □
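
The inductive step M_{i+1} = M_i ∪ N_i can be imitated in a finite toy universe, which may help to see why ∃x P(x, y) ends up with the same truth value in M as in the ambient universe. The sketch below makes strong simplifying assumptions that are ours alone: the universe is a finite set of integers, P has a single parameter y, and “least witness” (playing the role of the least level L_α combined with the choice function) means the least integer.

```python
def witness_closure(universe, m0, P):
    """Close m0 under least witnesses for the formula 'exists x P(x, y)'.

    `universe`, `m0` and `P` are hypothetical stand-ins for L, M_0 and a
    formula of the set; this is an illustration, not the text's construction.
    """
    universe = sorted(universe)
    m = set(m0)
    while True:
        new = set()
        for y in m:
            for x in universe:      # scanned in increasing order
                if P(x, y):
                    new.add(x)      # keep only the least witness
                    break
        if new <= m:                # no new witnesses: the closure is reached
            return m
        m |= new
```

After closing, a parameter y in M has a witness in the universe exactly when it has one in M, which is the finite analogue of the absoluteness of ∃x(P(x, ȳ)).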


4.6. Proposition. For every constructible set N such that the extensionality axiom is N-true there exist a unique constructible transitive set M and isomorphism f : (N, ∈) → (M, ∈).

PROOF. The plan of proof is the same as in subsection 7.6 of Chapter II.

First let “f is a continuous (α + 1)-sequence” be the formula “α is an ordinal” ∧ “f is a function” ∧ dom f = α + 1 ∧ (∀β ∈ α + 1)(β a limit ordinal ⇒ f(β) = ∪_{γ∈β} f(γ)). This formula is shown to be L-absolute as in subsection 5.14 below. Now consider the L-absolute operation φ(Z) = {X | X ∈ N ∧ X ∩ N ⊂ Z}, and let ∅_N be the unique member of N such that ∅_N ∩ N = ∅. Finally, let ψ(x, y) be the formula

    (∃f)(“f is a continuous (x + 1)-sequence” ∧ f(0) = ∅_N ∧
        (∀β ∈ x)(f(β + 1) = φ(f(β))) ∧ y = f(x)).

Then ψ is L-absolute, as can be shown as in subsections 5.14 and 5.15 below, and ψ(x, y) is L-true if and only if y = N_x in the sense of Chapter II, subsection 7.6.

We now set Ñ = ∪_α N_α = {z | (∃α)(∃y ⊂ N)(ψ(α, y) ∧ z ∈ y)}. We show that Ñ = N. Clearly Ñ ⊂ N, and if N \ Ñ = Y were nonempty, it would follow by the regularity axiom, which holds in L, that ∃Z(Z ∈ Y ∧ Z ∩ Y = ∅). For this Z we would have Z ⊂ Ñ, hence Z ⊂ N_α for a suitable α, so that Z ∈ N_{α+1}, which is a contradiction.

The implication Z ⊂ Ñ ⇒ ∃α(Z ⊂ N_α), which we have used here, follows because there exists an absolute function on Ñ which associates to each X the least α for which X ∈ N_α. The replacement axiom shows that there exists an ordinal α_0, namely, the least upper bound of the values of this function, for which N = Ñ = N_{α_0}. This ordinal, which is fixed for N, occurs in our subsequent construction, which is verified to be absolute as in §5.

Let “h is a constructing (α + 1)-sequence for N, M” be the formula “h is a continuous (α + 1)-sequence” ∧ h(0) = {⟨∅_N, ∅⟩} ∧ “(∀β ∈ α)(h(β + 1) is a function ∧ dom h(β + 1) = N_{β+1} ∧ the value of h(β + 1) on any X ∈ N_{β+1} is the set of h(β)-images of elements of X ∩ N).” Then for each α there is a unique such h; let M_α be the image of h(α). For α = α_0 we obtain a function f : N → M = M_{α_0}, where M is our desired constructible set and f is a constructible ∈-isomorphism.

The proposition is proved. □
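
On a finite extensional set the isomorphism of 4.6 can be computed directly from the rule “the value on X is the set of images of the elements of X ∩ N”. The following sketch assumes that sets are modelled by Python frozensets and replaces the transfinite sequence h by plain recursion with memoization; the name collapse is ours.

```python
def collapse(N):
    """Return a dict sending each X in N to its image under the collapsing map.

    Assumes N is a finite collection of frozensets on which extensionality
    holds, i.e. distinct members of N have distinct intersections with N.
    """
    N = frozenset(N)
    images = {}

    def f(x):
        # The image of x is the set of images of those elements of x
        # that themselves belong to N.
        if x not in images:
            images[x] = frozenset(f(y) for y in x if y in N)
        return images[x]

    return {x: f(x) for x in N}
```

For N = {∅, {∅, {∅}}} the map sends {∅, {∅}} to {∅}, so that the image M = {∅, {∅}} is transitive while membership between elements of N is preserved.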


5 Constructibility formula 


5.1. The purpose of this section is to prove Propositions 4.2 and 3.2. Both 
proofs are extremely straightforward, and simply consist in writing out 
explicitly the formulas L(x, y) and N(x, y) and verifying that the condi- 
tions in Lemma 2.4 apply. But since these formulas are very long, we 
perform the verifications in a series of “blocks,” in order to improve their 
appearance and to make the interpretation and verification of the condi- 
tions in 2.4 easier. As soon as a block (subformula) is constructed and its 
absoluteness is verified, we replace it by an abbreviated notation in the 
next formula. 

The material within each subsection is arranged in the following order: 
first the abbreviated notation for the formula which is being constructed 
and shown to be absolute in the subsection; then the complete form of the 
formula; and finally any remarks that may be needed regarding absolute- 
ness. The “complete form” of the formula may contain abbreviated nota- 
tion for subformulas. If such a subformula has not yet been interpreted in 
detail and shown to be absolute, this is done right after the complete form. 

By absoluteness we mean “M-absoluteness for any transitive model M 
for the axioms without the axiom of choice.” 

Subsections 5.2-5.15 are devoted to the formula L(x, y), and subsec- 
tions 5.18-5.20 are devoted to the formula N(x, y). As the material we are 
dealing with accumulates, we shall allow ourselves to omit more and more 
details and to rely on the reader’s experience. 

The formulas

    z = F_i(x, y), i = 1, 2, 3;
    z = F_j(y), j = 4, 5, 6, 7, 8.

5.2. z = {x, y}: (∀u ∈ z)(u = x ∨ u = y) ∧ x ∈ z ∧ y ∈ z. This whole formula is clearly absolute by Lemma 2.4. From now on we shall not even comment on such simple cases.


5.3. z = x \ y: (∀u ∈ z)(u ∈ x ∧ u ∉ y) ∧ (∀u ∈ x)(u ∉ y ⇒ u ∈ z).

5.4. z = x × y: (∀u_1 ∈ x)(∀u_2 ∈ y)(⟨u_1, u_2⟩ ∈ z)
        ∧ (∀u ∈ z)(∃u_1 ∈ x)(∃u_2 ∈ y)(u = ⟨u_1, u_2⟩);
    ⟨u_1, u_2⟩ ∈ z: (∃v ∈ z)(v = ⟨u_1, u_2⟩);
    u = ⟨u_1, u_2⟩: (∀v ∈ u)(v = {u_1} ∨ v = {u_1, u_2})
        ∧ {u_1} ∈ u ∧ {u_1, u_2} ∈ u;
    {u_1, u_2} ∈ u: (∃v ∈ u)(v = {u_1, u_2}).
5.5. z = F_4(y) = dom y: (∀u ∈ z)(∃v ∈ ∪∪(y))(⟨u, v⟩ ∈ y)
        ∧ (∀u ∈ ∪∪(y))(∀v ∈ ∪∪(y))(⟨u, v⟩ ∈ y ⇒ u ∈ z).


Here ∪∪ appears because ⟨u, v⟩ = {{u}, {u, v}} ∈ y ⇒ u, v ∈ ∪∪(y). This formula is absolute, since a transitive model is closed with respect to the operation ∪ (see 3.1(e)). We shall write ∪^2 = ∪∪, and so on.
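
The role of the double union in 5.5 can be seen concretely: both coordinates of a Kuratowski pair that belongs to y are found two levels down inside y, so both quantifiers in the formula for dom y can be bounded by ∪∪(y). A sketch with Python frozensets standing in for sets (the helper names pair, big_union and dom are ours):

```python
def pair(u, v):
    """Kuratowski ordered pair <u, v> = {{u}, {u, v}}."""
    return frozenset({frozenset({u}), frozenset({u, v})})

def big_union(y):
    """The union of all elements of y."""
    return frozenset(z for member in y for z in member)

def dom(y):
    """dom y, with both quantifiers bounded by the double union of y."""
    candidates = big_union(big_union(y))
    return frozenset(u for u in candidates for v in candidates
                     if pair(u, v) in y)
```

Elements of y that are not ordered pairs are simply ignored, as in the formula.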


5.6. z = F_5(y): (∀u ∈ z)(∃v ∈ y)(∃w ∈ y)(v ∈ w ∧ u = ⟨v, w⟩) ∧ (∀v ∈ y)(∀w ∈ y)(v ∈ w ⇒ ⟨v, w⟩ ∈ z).


5.7. z= Fly): (Wu € zu, € U(y) Am, € U*))Gus € VOC 
Uy, Us EY Au = Cus, Uy, Up) A(Wu, € Uy) (Wen € U4”) Wu; € 
U*(y))(Kuy, Ua, U3> EY => (Uy, Uy, U> Ez). Here U 4 appears for the same 
reason as ∪^2 in 5.5. The formulas ⟨u_1, u_2, u_3⟩ ∈ y, etc., are shown to be absolute in the same way as in 5.4.

The operations F_7 and F_8 are treated analogously to F_6.


The formulas

    y = F_i″(x × x), for i = 1, 2, 3;
    y = F_j″(x), for j = 4, 5, 6, 7, 8.


5.8. y = F_i″(x × x), i = 1, 2, 3:

    (∀u ∈ y)(∃u_1 ∈ x)(∃u_2 ∈ x)(u = F_i(u_1, u_2))
        ∧ (∀u_1 ∈ x)(∀u_2 ∈ x)(F_i(u_1, u_2) ∈ y),

where F_i(u_1, u_2) ∈ y: (∃v ∈ y)(v = F_i(u_1, u_2)).

5.9. y = F_j″(x), j = 4, ..., 8: (∀u ∈ y)(∃v ∈ x)(u = F_j(v)) ∧ (∀v ∈ x)(F_j(v) ∈ y).
5.10. y = Φ(x) (see 1.4):

    (∀z ∈ y)(z ∈ x ∨ z ∈ F_1″(x × x) ∨ ··· ∨ z ∈ F_8″(x)) ∧ (∀z ∈ x)(z ∈ y)
        ∧ (∀z ∈ F_1″(x × x))(z ∈ y) ∧ ··· ∧ (∀z ∈ F_8″(x))(z ∈ y).


The class L is closed with respect to the operations F_i″. In fact suppose, for example, that i ≥ 4, and let X ∈ L. Let U ∈ L be a set containing all F_i(Y) for Y ∈ X. Then

    F_i″(X) = {Z | Z ∈ U, (∃y ∈ X)(Z = F_i(y)) is V-true}.

Since the formula Z = F_i(y) has been shown to be absolute, we may replace “V-true” by “L-true” here, and then apply Proposition 2.3. Thus, the formula y = Φ(x) is L-absolute by Lemma 2.4.

If M is an arbitrary transitive model, then the verification that M is closed with respect to F_i″ is somewhat different. Namely, the formula ∀x ∃!y(y = F_i″(x)) is obviously V-true. The formal deduction of this formula does not use the axiom of choice. Hence, the formula is M-true for any transitive model M. We therefore have Y ∈ M if X ∈ M, where Y = F_i″(X). We shall use this device many times in what follows.


5.11. “g closes x,” which is short for: “g is a function on ω_0, and g(n) = Φ^n(x) for all n ∈ ω_0.” We write the formula with the constant ω_0 and the free variables g and x:

    “g is a function” ∧ F_4(g) = ω_0 ∧ g(0) = x
        ∧ (∀n ∈ ω_0)(g(n + 1) = Φ(g(n))).

Here:
(a) “g is a function”:

    (∀u ∈ g)(∃u_1 ∈ ∪^2(g))(∃u_2 ∈ ∪^2(g))(u = ⟨u_1, u_2⟩)
        ∧ (∀u_1 ∈ ∪^2(g))(∀u_2 ∈ ∪^2(g))(∀u_3 ∈ ∪^2(g))
          (⟨u_1, u_2⟩ ∈ g ∧ ⟨u_1, u_3⟩ ∈ g ⇒ u_2 = u_3).

(b) g(0) = x: ⟨∅, x⟩ ∈ g.
(c) g(n + 1) = Φ(g(n)):

    (∃y ∈ ∪^2(g))(⟨n, y⟩ ∈ g ∧ ⟨n ∪ {n}, Φ(y)⟩ ∈ g),

where

    ⟨n ∪ {n}, Φ(y)⟩ ∈ g: (∃u ∈ ∪^2(g))(∃v ∈ ∪^2(g))
        (u = n ∪ {n} ∧ v = Φ(y) ∧ ⟨u, v⟩ ∈ g).

Since ω_0 ∈ M, the formula 5.11 is now easily seen to be absolute by the previous results.
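
The first few values of a function that “closes” x can be computed for hereditarily finite sets. The sketch below is deliberately reduced and should not be read as the operation Φ of the text: it applies only F_1 (unordered pair) and F_2 (difference), leaving out the remaining operations, and the names phi, closes, F1 and F2 are ours.

```python
def phi(x, binary_ops, unary_ops=()):
    """One closure step: x together with the images of the given operations
    on elements (unary) and on pairs of elements (binary) of x."""
    result = set(x)
    for op in binary_ops:
        result |= {op(u, v) for u in x for v in x}
    for op in unary_ops:
        result |= {op(u) for u in x}
    return frozenset(result)

def closes(x, steps, binary_ops, unary_ops=()):
    """The finite initial segment g(0), ..., g(steps) with g(0) = x and
    g(n + 1) = phi(g(n))."""
    g = {0: frozenset(x)}
    for n in range(steps):
        g[n + 1] = phi(g[n], binary_ops, unary_ops)
    return g

def F1(u, v):
    """Unordered pair {u, v}."""
    return frozenset({u, v})

def F2(u, v):
    """Difference u minus v."""
    return u - v
```

Starting from {∅}, one step already yields {∅, {∅}}, and each further step contains the previous one, as the clause (∀z ∈ x)(z ∈ y) in 5.10 requires.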

In 5.11 we took the liberty of using g and n for variables of L_1Set in order to make the formulas intuitively clearer. In what follows we shall also use α, β, K, and N as variables, thereby temporarily ignoring our convention of only using small letters at the end of the Latin alphabet.


5.12. y ∈ 𝔖x: ∃g(“g closes x” ∧ (∃n ∈ ω_0)(⟨n, y⟩ ∈ g)). Here the quantifier over g is not restricted. Since the formula under the ∃g sign is absolute, we may conclude directly from the definition | |_M(ξ) = | |_V(ξ), ξ ∈ M, that y ∈ 𝔖x is also absolute, provided we show that, for any X ∈ M, the function G ∈ V which closes X lies in M. The formula ∀x ∃!g (“g closes x”) is obviously V-true. If we formalize the verification of this fact, we see that this formula is deducible from the axioms without the axiom of choice. Hence it is M-true. This implies that for any X ∈ M we have G ∈ M.


5.13. y ∈ P(x) ∩ 𝔖(x ∪ {x}): (∀z ∈ y)(z ∈ x) ∧ y ∈ 𝔖(x ∪ {x}).


5.14. “f is the constructing (α + 1)-sequence,” which is short for: “α is an ordinal” ∧ “f is a function” ∧ dom f = α + 1 ∧ (∀β ∈ α + 1)(f(β) = L_β). Here:

(a) (∀β ∈ α + 1)(f(β) = L_β):

    (∀β ∈ α + 1)((β is a limit ordinal ⇒ f(β) = ∪_{γ∈β} f(γ))
        ∧ (f(β + 1) = P(f(β)) ∩ 𝔖(f(β) ∪ {f(β)}))).

(b) “β is a limit ordinal”: “β is an ordinal” ∧ (∀α ∈ β)(β ≠ α ∪ {α}).

(c) f(β) = ∪_{γ∈β} f(γ): (∃v ∈ ∪^2(f))(v = ∪_{γ∈β} f(γ) ∧ ⟨β, v⟩ ∈ f);

    v = ∪_{γ∈β} f(γ): (∀u ∈ v)(∃γ ∈ β)(u ∈ f(γ))
        ∧ (∀u ∈ ∪^3(f))(∀γ ∈ β)(u ∈ f(γ) ⇒ u ∈ v);
    u ∈ f(γ): (∃w ∈ ∪^2(f))(⟨γ, w⟩ ∈ f ∧ u ∈ w).

(d) f(β + 1) = P(f(β)) ∩ 𝔖(f(β) ∪ {f(β)}):

    (∃u ∈ ∪^2(f))(⟨β + 1, u⟩ ∈ f ∧ (∀v ∈ u)
        (v ∈ P(f(β)) ∩ 𝔖(f(β) ∪ {f(β)}))
        ∧ ∀v(v ∈ P(f(β)) ∩ 𝔖(f(β) ∪ {f(β)}) ⇒ v ∈ u));
    v ∈ P(f(β)) ∩ 𝔖(f(β) ∪ {f(β)}):
        (∃u ∈ ∪^2(f))(⟨β, u⟩ ∈ f ∧ v ∈ P(u) ∩ 𝔖(u ∪ {u})).

Finally, in order to verify directly that the subformula

    ∀v(v ∈ P(f(β)) ∩ 𝔖(f(β) ∪ {f(β)}) ⇒ v ∈ u)

is M-absolute, it suffices to show that M is closed with respect to the operation X ↦ P(X) ∩ 𝔖(X ∪ {X}). But M is closed with respect to both 𝔖 and X ↦ P(X) ∩ M, so the verification is complete.


5.15. L(x, y): “y is an ordinal and x ∈ L_y”: “y is an ordinal” ∧ ∃f(“f is the constructing (y + 1)-sequence” ∧ (∃z ∈ ∪^2(f))(⟨y, z⟩ ∈ f ∧ x ∈ z)). Since the quantifier ∃f is not bounded, in order to verify this last absoluteness statement we must show that the constructing (Y + 1)-sequence F is an element of M for any ordinal Y in M. We use the same argument as in 5.12: the formula ∀y(y is an ordinal ⇒ ∃!f(f is the constructing (y + 1)-sequence)) not only is V-true, but also is deducible from the axioms without the axiom of choice; therefore it is M-true.

This completes the proof of Proposition 4.2. □


5.16. Remark. The formula ∀x ∃y L(x, y) is often written in the form: V = L, and is called the axiom of constructibility. The absoluteness of L(x, y) implies that the following formula is L-true:

    |V = L|_L = inf_{X∈L} sup_{Y∈L} |L(X, Y)|_L = inf_{X∈L} sup_{Y∈L} |L(X, Y)|_V = 1.

Hence, this formula is consistent with the Zermelo-Fraenkel axioms. On the other hand, V = L implies the Generalized Continuum Hypothesis (GCH), and, since the negation of the GCH is also consistent with the Zermelo-Fraenkel axioms, it follows that ¬(V = L) is consistent with the axioms.

We now proceed to the proof of Proposition 3.2. This proof follows the 
same plan as the proof of Proposition 4.2. We return to the conventions 
and constructions in 1.7-1.8. 


5.17. Remark. In subsection 4.4 we exploited the fact that the assertion
“$\alpha \geq \omega_0 \Rightarrow \mathrm{card}(L_\alpha) = \mathrm{card}(\alpha)$” is formally deducible from the axioms of
$L_1$Set (without the axiom of choice). We may now see that such a formal
deduction can be obtained by exactly mimicking the proof in subsection
1.4. Indeed, from the definition of $L(x, y)$ we have the formal deducibility
of “Lay, = PCL) G(L, U {L,})” and “8 a limit ordinal L, = 
U,epgl,”. Moreover, the following are deducible: “card(X) < wo=> 
card(X) < card(P(X)) < wo” and “card(X) > wy = card(9(X)) = 
card(X)”. As a result, the assertions “card(L,,) = wo”, “card(L,) > 
Wy => card(L,,,) = card(L,)” and “B a Gait ordinal => 
card(Lp) = card( Se ala) are all deducible. And from these and the 
axioms of L,Set the desired assertion may be deduced (using, in particular, 
the deducibility of “card(w9) = wo”, “a > wy=>card(a + 1) =card(a)”, “B 
is a limit ordinal=> 8 = U yes and in addition an instance of transfinite 
induction on the ordinals, which is of course also formally deducible in 
LSet). 

5.18. The formula $H(K, x)$: $K$ is a function $\wedge$ $x$ is an ordinal $\wedge$ dom $K = x + 1$
$\wedge$ $K(0) = \langle 0, 0, 0\rangle$ $\wedge$ $(\forall y \in x + 1)$($K(y)$ is an important triple $\wedge$ $K(y + 1)$
is the next important triple after $K(y)$) $\wedge$ ($y$ is a limit ordinal $\Rightarrow K(y) = \lim_{z<y} K(z)$)
is absolute.

We shall not analyze the subformulas which have been considered 
before. The following subformulas remain: 


(a) “K(y) is an important triple A K(y + 1) is the next important triple 
after K(y)” 
(b) K(y) = lim, <, K (2). 


We shall have to use the absoluteness of the auxiliary formula “y = 
Xj which is short for: “x is an important triple (i.t.) and y is the ith 
coordinate of x,” where i = 1, 2, or 3. That is: 


(au, EW *(x))(Au, EU *(x)) (Au; E U (x)) 
X (x = (uy, Uy, U3> Au, is an ordinal Au, < 8 
/\u, is an ordinal Au, is an ordinal A y = u,). 
The complete form of (a) 1s: 
(Aue U(K))(av € U(K))((y WE KAGVY+LwEK 
Auisani.t.A\v is the it. after u). 


According to Lemma 1.7(a), “wu is an it.A\v is the it. after u” can be 
written in the form We ,C,(u, v), where C;(u, v) is the formalization of the 
ith alternative in 1.7(a). For example, 


C}: u is an LtAv is an LtAuay < 7A vay = Yay + 1 

Aba) = May /\ V3) = Hy 
Cy: uisanit.Avisani.t.Aug, =8 Aug +1 <u) 

Qa) = 0A vQ) = Ua) +] /\ ©) — 0. 
The other C, are analogous, and are absolute for the same reasons. 


The complete form of (b). Here we need to know that the following 
auxiliary formulas are absolute: 


u= U K(z),),i=20r3: (Wo Eu)(az Ey)(v = K(z)) 


zey 
A(Wz E y)(Av E u)(v Cae K(z) i); 
v= K(z)): (Awe U(K))(<z, > EK Awisani.t.Av = wi). 
Then, using Lemma 1.7(b), we explain the formula K(y) =lim,.,K(z) as 
follows:


K(y)ay = 


4 
0A Au, 3in(u= U K(z)) /\u3 = U K(Z)ayA MV D,(t, =) 


zey zey 


where the alternatives D, have the following structure, depending on “how 
K(z) approaches K(y)”; 


Dj: uy © uz A uy is a limit ordinal A ((Az € y)( K (z)3) = 43) 
> K(¥)a) = A K(¥)3) = U3); 
Dy: Uy = U3 A uy isa limit ordinal A ((Az € y)(K (z)3) = 43) 
A(Wz Ey)(K(z)g) Eu) > K(y)a)=wAK(y)a)= 0); 
D3: Uy > u3 A us; is a limit ordinal A ((Wz € y)(K (z)g) € 43) 
> K(y)ay = uy © K(y)a) = U3); 
Dg: uy = Uy A uz is a limit ordinal A ((Wz € y)(K (z)a) € 4 


AK (z)3) © 43) > K (Vay = 0A K(y)a = U3). 


It is therefore obvious that the $D_i$ are absolute. Even though the
quantifiers $\exists u_2$ and $\exists u_3$ are not restricted, there is no problem, since, when
$K^\xi, y^\xi \in L$, this formula can only be $V$-true if $u_2^\xi$ and $u_3^\xi$ are uniquely
determined ordinals, and lie in $L$, which gives us $L$-truth.


5.19. The formula $S(N, x)$: “$x$ is an ordinal $\wedge$ $N$ is a function $\wedge$ dom $N = x + 1$
$\wedge$ $(\forall y < x + 1)$($N(y)$ is a constructible set with $N$-number $y$)” is
absolute.

We shall need to know that the following auxiliary formula is absolute: 

y=(x), i=1,2,3, where K(x) =<(x),, (x2), (x)> 

(not to be confused with the formula y = Xj in 5.16, which occurs here as 
a subformula): x is an ordinal \ 4K (H(K, x) A (au € U(K)){Kx, Oo EK 
/\Y = Wj )). Even though 4K is not restricted, this does not cause any 


problem, because, for every ordinal x* € L, the value of K* making 
H(K®, x*) V-true lies in L. In fact, the V-true formula 


$\forall x$($x$ is an ordinal $\Rightarrow \exists!\, K\,(H(K, x))$)


is deducible from the axioms without the axiom of choice, and hence is 
L-true. 

We now return to S(N, x). We need only show that the subformula 
“N(y) is a constructible set with N-number y” is absolute. By definition, 
this subformula can be written as WOK y, N), where the alternatives 
have the form:


Q: (¥) =9ACY, Lyn EN; 
Q,1<i<3: (»),=iAdy, F(N((y)) N(()s)) ENS 
Q,4<1<8: (y),=iAdy, F(N((),))> EN. 


The absoluteness of the subformulas that have not been analyzed is 
clear from the following complete forms of these formulas: 


(a) (y, Ly EN: (Az EUN)KY, 2ENAZE Luy,): 
z=Ly: (duey+ Du=(y)Az= L,); 
z=L,: WvezyvE Ly) AVo(e EL, >0 €z). 


We can verify directly that the last subformula, with the unrestricted quanti- 
fier Vv, is absolute, since Ly © L for any ordinal U, and L is transitive. 


(b) <y, F(N(y) PEN, 1=4,..., 8: 
(du, vo, WE UN) )(u = (vy, Accu, 2 EN Aw = Fi(v) A<y, wp EN). 
(c) <¥, F(N()2) N()3)) 9 EN, = 1, 2, 3: 
(uy, us, Oz, v3, WE U(N)) (ug = () A ty = ()3 A Katy, 02 EN 
Muy, 03> E N/A w= F,(v2, 3) \Cy, w> EN). 


5.20. The formula $N(x, y)$: “$x$ is an ordinal $\wedge$ $y = N(x)$” is absolute.
In fact, this formula is written in the form

$\exists N\,(S(N, x + 1) \wedge \langle x, y\rangle \in N)$.

There is no problem with $\exists N$ being unrestricted, since we can apply the
same type of argument as we have used many times before: for any ordinal
$x^\xi$ there is a unique $N^\xi$ making this formula $V$-true, and then $N^\xi \in L$,
since the formula $\forall x$($x$ is an ordinal $\Rightarrow \exists!\, N\,(S(N, x + 1))$) is deducible
from the axioms without the axiom of choice, and hence is $L$-true.

This completes the proof of Proposition 3.2. □


6 Remarks on formalization 


Gödel’s theory, to which this chapter is devoted, is usually presented in a
more syntactic version. We shall now briefly describe the system of basic 
ideas and the most important changes in the proofs in this version, in 
which the least possible appeal is made to the semantics. 


6.1. Let $Q(x)$ be a formula in $L_1$Set with one free variable $x$. Let ZF be the
set of all the (logical, special, and equality) axioms of $L_1$Set except for the
axiom of choice. $Q(x)$ is said to be transitive if


$$\mathrm{ZF} \vdash (Q(x) \wedge y \in x) \Rightarrow Q(y).$$


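6.2. The relativization $P_Q$ of a formula $P$ in $L_1$Set relative to $Q$ is defined
by induction on the number of connectives and quantifiers in $P$:


$(x \in y)_Q$ is $Q(x) \wedge Q(y) \Rightarrow x \in y$;
$(x = y)_Q$ is $Q(x) \wedge Q(y) \Rightarrow x = y$;
$(\neg P)_Q$ is $\neg(P_Q)$;
$(P_1 * P_2)_Q$ is $(P_1)_Q * (P_2)_Q$, for any connective $*$;
$(\forall x\, P)_Q$ is $\forall x\,(Q(x) \Rightarrow P_Q)$;
$(\exists x\, P)_Q$ is $\exists x\,(Q(x) \wedge P_Q)$.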
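
The inductive definition in 6.2 is a purely syntactic transformation, so it can be sketched as a short recursive function. The following is only an illustration under stated assumptions: formulas are encoded as nested Python tuples (a hypothetical encoding, not the book's), atomic formulas are read as guarded by the implication $Q(x) \wedge Q(y) \Rightarrow \ldots$, and quantified bodies are relativized recursively. The name `relativize` and the tags `"in"`, `"eq"`, and so on are invented for this sketch.

```python
def relativize(p, q):
    """Return the relativization P_Q of the formula p.

    p is a nested tuple (hypothetical encoding):
      ("in", x, y), ("eq", x, y), ("not", P),
      ("and"|"or"|"implies"|"iff", P1, P2),
      ("forall", x, P), ("exists", x, P)
    q maps a variable name to the formula Q(variable).
    """
    tag = p[0]
    if tag in ("in", "eq"):
        # Atomic case: guard both variables by Q.
        _, x, y = p
        return ("implies", ("and", q(x), q(y)), p)
    if tag == "not":
        return ("not", relativize(p[1], q))
    if tag in ("and", "or", "implies", "iff"):
        # Connectives commute with relativization.
        return (tag, relativize(p[1], q), relativize(p[2], q))
    if tag == "forall":
        _, x, body = p
        return ("forall", x, ("implies", q(x), relativize(body, q)))
    if tag == "exists":
        _, x, body = p
        return ("exists", x, ("and", q(x), relativize(body, q)))
    raise ValueError(f"unknown formula tag: {tag!r}")


# Example: relativize "there exists x with x in y" to a predicate Q.
Q = lambda v: ("Q", v)
example = relativize(("exists", "x", ("in", "x", "y")), Q)
```

The recursion mirrors the induction on the number of connectives and quantifiers: each call strips one outer symbol and recurses on strictly smaller formulas.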


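6.3. $Q(x)$ is called an (internal) model of $L_1$Set if for any axiom $P \in$ ZF we
have


$$\mathrm{ZF} \vdash P_Q.$$

This model is transitive if $Q$ is transitive.
A formula $P(y_1, \ldots, y_n)$ is called $Q$-absolute if


$$\mathrm{ZF} \vdash (Q(y_1) \wedge \cdots \wedge Q(y_n)) \Rightarrow (P \Leftrightarrow P_Q).$$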


6.4. The connection between these concepts and our earlier ones is as
follows. Every formula $Q(x)$ determines a class $M = \{X \in V \mid Q(X)$ is
$V$-true$\}$. This class $M$ has the property that


$$|P|_M(\xi) = |P_Q|_V(\xi), \quad \forall \xi \in M,$$


for any formula $P$ (as can easily be proved by induction on the number of
connectives and quantifiers in $P$). Thus, to give a syntactic reformulation
of our proofs we must make the following changes throughout:


(a) We only consider classes M which are defined by formulas Q, and all 
references to M are replaced by references to Q. 

(b) We everywhere replace “P is V-true” by “P is deducible from ZF.” 

(c) We everywhere replace “P is M-true” by “$P_Q$ is deducible from ZF.”

(d) We everywhere replace “P is M-absolute” by “P is Q-absolute.” 


In order for the new assertions on deducibility from ZF to become 
sufficiently obvious, we must either do some additional work formalizing 
the proofs or else give more careful intuitive proofs. In particular, we must 
find finite subsets of ZF from which the various facts are deducible. The 
basic results are stated as follows in the new syntactic language: 


6.5. $\exists y\, L(x, y)$ “is” a transitive internal model of $L_1$Set.
6.6. ZF $\vdash$ (Axiom of choice)$_{\exists y\, L(x, y)}$.
6.7. ZF $\vdash$ (Generalized Continuum Hypothesis)$_{\exists y\, L(x, y)}$.


6.8. Thus, a completely syntactic version of Gödel’s theory would consist
of all the deductions implicit in 6.5-6.7, without any commentary. Of
course, such a treatment has never been written. The formula $\exists y\, L(x, y)$
alone takes up several pages; without appealing to semantics, it would be
impossible either to think up, or to explain, or even to copy down all this
without making mistakes. The deductions of all the required relativized
formulas $P_{\exists y\, L(x, y)}$ would also be extremely long. This situation gives us an
instructive example of what was discussed in “Digression: Proof” in
Chapter II.


7 What is the cardinality of the continuum? 


After all we have learned about the Zermelo—Fraenkel language and axiom 
system, it might seem naive to return to this question. But we must do so if 
we consider mathematical meaning to be our primary concern. 

Some specialists in the foundations of mathematics espouse a different 
point of view. Namely, they answer that the question itself is meaningless. 
It seems that Paul Cohen himself tends toward this viewpoint, at the same 
time admitting that “this is a hard decision” (P. Cohen, Comments on the 
foundations of set theory, Proc. Symp. Pure Math., vol. XIII, part I, 
American Math. Soc., Providence 1971, p. 12). 

From this point of view it is natural to reject almost the entire semantics 
of $L_1$Set, including all the $V_\alpha$ starting with $\alpha = \omega_0 + 1$ in the von Neumann
universe. No half-way solutions can help matters, especially since ques- 
tions concerning higher axioms of infinity or the so-called “measurable 
cardinals” are in an even worse position than the CH. 

It thus becomes necessary to try to find alternative languages and 
semantics. Here the differences of opinion are wide and irreconcilable. The 
most clear-cut position is that of the constructivists, although even among 
them there are different shades of opinion. The constructivists do not 
recognize infinity as a usable concept, and reject noneffective existence 
proofs. (It turns out that in practice they often replace these noneffective
proofs by a more carefully differentiated word usage, such as “there cannot not
exist” or “there quasi-exists,” which is nearly synonymous with certain
linguistic precautions adopted in classical texts.) In our opinion, the
shortcoming in their point of view is that constructivism is in no sense 
“another mathematics.” It is, rather, a sophisticated subsystem of classical 
mathematics, which rejects the extremes in classical mathematics and 
carefully nourishes its effective computational apparatus. 

Unfortunately, it seems that it is these “extremes” (bold extrapolations,
abstractions which are infinite and do not lend themselves to a constructivist
interpretation) which make classical mathematics effective. One should
try to imagine how much help mathematics could have provided twentieth
century quantum physics if for the past hundred years it had developed
using only abstractions from “constructive objects.” Most likely, the stan- 
dard calculations with infinite dimensional representations of Lie groups 
which today play an important role in understanding the microworld, 
would simply never have occurred to anyone. 

It is not impossible that a new (or a completely forgotten old) concep- 
tion of the continuum, in which the continuum has no “cardinality,” could 
be found in the course of a deep investigation of the external world. The 
notion of a set consisting of elements may actually only be adequate for 
finite or countable sets, and “higher infinities” may turn out to be
abstractions from objects of a completely different type. 

Physics seems to point up a difference in principle between “counting” 
and the Eudoxos—Dedekind idealization of measurement. The counting 
procedure applies to regions of attraction—“attractors” (R. Thom)—which 
are units not having sharp boundaries. The parts of a unit, even if they 
have physical meaning, are nevertheless attractors of a different sort. But 
even these ideas apparently stop making sense in the microworld. 

If nature has a fundamentally statistical aspect, it might be fruitful to 
consider mathematical models in which the statistical aspect appears as an 
undefined concept. The unexpected richness of the nonstandard interpreta- 
tions of classical mathematics in Boolean-valued models agrees with the 
suggestion that all the words we say should be understood in a new way. 


7.2. We now discuss a less radical point of view on the continuum 
problem, according to which this question of its cardinality is meaningful. 
Then the main problem once again becomes how to determine the place of 
the continuum on the scale of alephs. 

Cohen concludes his book with the following opinion: “A point of view 
which the author feels may eventually come to be accepted is that CH is 
obviously false... . $C$ is greater than $\aleph_n$, $\aleph_\omega$, $\aleph_\alpha$ where $\alpha = \aleph_\omega$, etc. This
point of view regards C as an incredibly rich set given to us by one bold 
new axiom, which can never be approached by any piecemeal process of 
construction.” 

We thus have a conjectural estimate from below for C, and nothing 
more—not even a conjecture as to whether the cardinal C is regular or 
singular. 

Of course, the real problem consists not only in guessing a plausible 
conjecture, but in supporting it with sufficiently convincing indirect evi- 
dence for it to become widely accepted, even if not proved. What sort of 
evidence could this be? In discussing new axioms for set theory, Gödel
writes: 


“there may exist... other (hitherto unknown) axioms of set theory which 
a more profound understanding of the concepts underlying logic and 
mathematics would enable us to recognize as implied by these concepts. 

“Furthermore, however, even disregarding the intrinsic necessity of 
some new axiom, and even in case it had no intrinsic necessity at all, a 
decision about its truth is possible also in another way, namely, inductively
by studying its “success,” that is, its fruitfulness in consequences
and in particular in “verifiable” consequences, i.e., consequences demon- 
strable without the new axiom, whose proofs by means of the new axiom, 
however, are considerably simpler and easier to discover, and make it 
possible to condense into one proof many different proofs. The axioms 
for the system of real numbers, rejected by the intuitionists, have in this 
sense been verified to some extent owing to the fact that analytic number 
theory frequently allows us to prove number theoretic theorems which can 
subsequently be verified by elementary methods. A much higher degree of 
verification than that, however, is conceivable. There might exist axioms 
so abundant in their verifiable consequences, shedding so much light 
upon a whole discipline, and furnishing such powerful methods for 
solving given problems (and even solving them, as far as that is possible, 
in a constructivistic way) that quite irrespective of their intrinsic necessity 
they would have to be assumed at least in the same sense as any well 
established physical theory.” (K. Gödel, What is Cantor’s continuum
problem?, Amer. Math. Monthly, vol. 54, no. 9, 1947). 


There is little to add here to this ardently expressed hope. But see §8 of 
Chapter VII, where it is shown using an idea of Gödel’s own that any new
independent axiom can shorten to an arbitrary extent the proofs of 
suitable assertions which are provable without the axiom. This result 
somewhat weakens our confidence in pragmatic criteria for truth. 

CHAPTER V


Recursive functions and Church’s thesis 


1 Introduction. Intuitive computability 


1.1. The first part of this book was primarily concerned with mathematical 
proof; we showed that the analogous concept in formal languages is that of 
formal deduction, after which the most interesting results were that certain 
intuitive mathematical assertions (such as the Continuum Hypothesis and 
its negation) are not deducible. 

Our primary concern in the second part of the book is the notion of a 
determinate computational process, that is, the processing of information, or, 
briefly, the notion of an algorithm. In §2 we give a precise and presumably 
complete characterization of everything that can be obtained using com- 
putational algorithms. Then the most interesting results turn out to be 
assertions that certain intuitively defined functions cannot be computed by 
an algorithm (Chapter VI). 

Both the theory of proof and the theory of computation can be 
presented in a large part independently of one another. This is the 
approach we have adopted, even though it does not correspond to the 
historical development. But when the machinery of both theories has been 
developed to a certain point, it becomes possible to apply each theory to 
investigate the other. The third part of the book is devoted to such 
applications. 

In this section we describe informally the main focal points of the 
theory of computability. We appeal to the reader’s intuitive notion of 
algorithms, which can be conveniently used to illuminate the structure and 
interrelations of the basic concepts. 

When we make these concepts precise in the next section, we shall not 
give a description of the algorithms themselves, but rather of their results, 
i.e., computable functions. The concept of an algorithm seems to lose too
much in any formalization, while the notion of algorithmic computability 
seems not to lose anything essential. 


1.2. We now introduce several simple basic concepts. Let $X$ and $Y$ be two
sets. A partial function (or mapping) from $X$ to $Y$ is any pair $\langle D(f), f\rangle$
consisting of a subset $D(f) \subset X$ and a mapping $f\colon D(f) \to Y$. Here $D(f)$
(instead of the earlier dom $f$) is called the domain of definition of $f$; $f$ is
defined at a point $x \in X$ if $x \in D(f)$; $f$ is nowhere defined if $D(f)$ is
empty; and there exists a unique nowhere defined partial function.

We let $\mathbb{Z}^+ = \{1, 2, 3, \ldots\}$ denote the set of natural numbers, excluding
zero. (It is not necessary, only convenient, to exclude zero.) If $n \geq 1$, we let
$(\mathbb{Z}^+)^n$ denote the $n$-fold direct product of $\mathbb{Z}^+$ with itself, i.e., the set of
ordered $n$-tuples $\langle x_1, \ldots, x_n\rangle$, $x_i \in \mathbb{Z}^+$. It is convenient to let $(\mathbb{Z}^+)^0$ denote
the set consisting of an arbitrary element, denoted “$\cdot$.” The basic objects
of our concern will be partial functions from $(\mathbb{Z}^+)^m$ to $(\mathbb{Z}^+)^n$ for various $m$
and $n$. When we classify these functions according to their computability,
the reader can think of the word “program” as referring to a program for a
universal computer which is written without regard to time or memory
limitations. Here every program for computing a function has a special
“blank space” in which to insert the value of the argument.


1.3. The basic informal definitions. (a) A partial function $f$ from $(\mathbb{Z}^+)^m$ to
$(\mathbb{Z}^+)^n$ is called computable if there exists a “program” which, whenever a
vector $x \in (\mathbb{Z}^+)^m$ is entered in the input, gives as output


$f(x)$, if $x \in D(f)$;
$0$, if $x \notin D(f)$.


Here 0 merely indicates that $f$ is not defined at $x$; we could allow the
output in this case to be anything not in $(\mathbb{Z}^+)^n$.

(b) A partial function $f$ from $(\mathbb{Z}^+)^m$ to $(\mathbb{Z}^+)^n$ is called semi-computable if
there exists a “program” which, whenever a vector $x \in (\mathbb{Z}^+)^m$ is entered in
the input, gives $f(x)$ as output if $x \in D(f)$, and either gives 0 as output or
else works infinitely long without stopping if $x \notin D(f)$.

In particular, computable functions are semi-computable, and everywhere 
defined semi-computable functions are computable. 

(c) A partial function f is called noncomputable if it does not satisfy 
condition (b) (and a fortiori (a)). 


1.4. Comments 

(a) The most basic of these three concepts is semi-computability, since
computability reduces to this property. In fact, to determine whether a
semi-computable function is computable, we proceed as follows.

Let $X \subset Y$ be two sets. By the characteristic function of $X$ in $Y$ we mean
the function $\chi_X\colon Y \to \mathbb{Z}^+$ such that


$\chi_X(x) = 1$, if $x \in X$;
$\chi_X(x) = 2$, if $x \notin X$.


Note that $\chi_X$ is everywhere defined on $Y$.

Now let $f$ be a semi-computable function from $(\mathbb{Z}^+)^m$ to $(\mathbb{Z}^+)^n$. If $f$ were
computable as well, then the characteristic function of $D(f)$ would also be
computable: simply add to the program which computes $f$ the instructions
“send 0 to 2, and anything not 0 to 1, and print as output.” Conversely, if
$\chi_{D(f)}$ is computable, then so is $f$: in front of the program which semi-computes
$f$, put the program which computes $\chi_{D(f)}$, and then the instruction to
give 0 as output immediately if $\chi_{D(f)}(x) = 2$ and to continue with the
program for $f$ with $x$ as the argument if $\chi_{D(f)}(x) = 1$. Thus, since the
everywhere defined function $\chi_{D(f)}$ is computable if and only if it is
semi-computable, we have: $f$ is computable $\Leftrightarrow$ $f$ is semi-computable and $\chi_{D(f)}$ is
semi-computable. Later, we shall first formalize the concept of semi-computability,
and then take the right side of this equivalence as the formalization
of computability.
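
The two directions of this argument can be sketched in Python. This is only an illustration: the names `semi_sqrt`, `chi_squares`, `make_computer`, and `make_chi` are hypothetical, and the "exact square root" function is an assumed toy example that does not come from the text. A program that "works infinitely long" is modelled by a loop that never exits.

```python
import math


def semi_sqrt(x):
    """Semi-computes the exact square root: halts only on perfect squares."""
    k = 1
    while k * k != x:  # loops forever when x is not a perfect square
        k += 1
    return k


def chi_squares(x):
    """Characteristic function of D(semi_sqrt): 1 on the domain, 2 off it."""
    r = math.isqrt(x)
    return 1 if r * r == x else 2


def make_computer(semi_f, chi_domain):
    """Put the program for chi in front of the program that semi-computes f."""
    def f(x):
        if chi_domain(x) == 2:
            return 0          # 0 signals "f is not defined at x"
        return semi_f(x)      # safe: x lies in D(f), so this call halts
    return f


def make_chi(f):
    """Conversely: send output 0 to 2 and anything not 0 to 1."""
    def chi(x):
        return 2 if f(x) == 0 else 1
    return chi


exact_sqrt = make_computer(semi_sqrt, chi_squares)
chi_again = make_chi(exact_sqrt)
```

Calling `semi_sqrt(10)` directly would never return, whereas `exact_sqrt(10)` halts with the marker 0, which is exactly the difference between semi-computing and computing.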

(b) There exist noncomputable functions. In fact, any program is a finite
text in a finite alphabet, so that the set of programs is countable, while the
set of all functions $\mathbb{Z}^+ \to \mathbb{Z}^+$ is uncountable. (For a critical discussion of
this argument, see 1.5 below.)


AN EXAMPLE OF A NONCOMPUTABLE FUNCTION. We consider the language of
arithmetic $L_1$Ar which was described in §10 of Chapter II, and number the
formulas of this language as explained in §11 of Chapter II. We define a
function $f$ by stipulating that


$f(x) = 1$, if the $x$th formula is true in the standard interpretation;
$f(x)$ is not defined, if the $x$th formula is false.


The function f is noncomputable. In Chapter VII we shall see that this 
follows because the set D(f) is not definable in arithmetic, by Tarski’s 
theorem. 

In other words, it is impossible (even in principle) to distinguish the set 
of all number theoretic truths by writing a single program (even a very 
long and complicated one) which could tell from a statement’s formulation 
whether it is true. Of course, to prove this result requires a much deeper 
analysis of the concept of computability. 


(c) There exist functions which are semi-computable but not computable.
We first give a typical example of a program which semi-computes a
function. We consider the following function $f$ from $\mathbb{Z}^+$ to $\mathbb{Z}^+$, which is
defined in terms of Fermat’s problem:


$f(n) = 1$, if there exist $x, y, z \in \mathbb{Z}^+$ for which $x^{n+2} + y^{n+2} = z^{n+2}$;
$f(n)$ is not defined, otherwise.


Here is a program which semi-computes $f$: after entering $n$ in the input,
run through all vectors $\langle x, y, z\rangle$ in a suitable order. (For example, according
to increasing $x + y + z$, and for given $x + y + z$, in lexicographic
order.) For each such vector verify whether or not $x^{n+2} + y^{n+2} = z^{n+2}$. If
this equation holds, give 1 as output; otherwise, go on to the next $\langle x, y, z\rangle$.
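
The search just described can be sketched as follows. This is a minimal sketch, not a definitive implementation: the function names and the `max_triples` cutoff are assumptions added so that the example can be run and stopped. The program in the text has no cutoff and simply never halts when no solution exists.

```python
from itertools import count


def triples():
    """Yield every <x, y, z> in (Z+)^3 by increasing x + y + z,
    and in lexicographic order for a given sum."""
    for s in count(3):
        for x in range(1, s - 1):
            for y in range(1, s - x):
                yield x, y, s - x - y


def search_power_sum(exponent, max_triples=None):
    """Return 1 once some triple satisfies x^e + y^e = z^e.

    With max_triples=None this runs forever if there is no solution.
    Returning None after the (hypothetical) cutoff only means "gave up";
    it says nothing about whether the function is defined.
    """
    for i, (x, y, z) in enumerate(triples()):
        if max_triples is not None and i >= max_triples:
            return None
        if x**exponent + y**exponent == z**exponent:
            return 1


def f(n, max_triples=None):
    """The function of the text: the exponent is n + 2."""
    return search_power_sum(n + 2, max_triples)
```

For exponent 2 the search halts (at the triple 3, 4, 5), which shows the enumeration does reach every vector; for `f(n)` with any `n` in $\mathbb{Z}^+$ only the cutoff stops it.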

Hence, f is semi-computable. But it is not known whether or not f is 
computable. According to Fermat’s conjecture, f is nowhere defined (and 
hence computable!). The strongest theoretical results known concerning f 
—the so-called criteria of Kummer, Wieferich, Vandiver, and others—may 
be regarded as a sort of approximation to proving that f is computable, not 
that $f$ is nowhere defined. That is, in order to verify the Fermat conjecture
successively for various values of $n$, we must perform a (machine) computation
(whose size grows rapidly with $n$) to determine $\chi_{D(f)}$ at the point
$n$, when this determination is possible.

There is an analogous example of a semi-computable function which we
actually know is not computable. In Chapter VI we prove that there exists
a polynomial $P(t, x_1, \ldots, x_n)$ with integer coefficients such that the
function


$g(t) = 1$, if the equation $P(t, x_1, \ldots, x_n) = 0$ is solvable with $x_1, \ldots, x_n \in \mathbb{Z}^+$;
$g(t)$ is not defined, otherwise,


is not computable. This function is semi-computable by the same argument
as in the case of the function connected with Fermat’s equation.


1.5. Critical discussion of the above proofs. Before proceeding further, we 
consider from a more critical point of view, for example, the argument in 
1.4(b). The first weak point that catches our attention is that we did not 
say precisely what a program is. But this is not essential; for any fixed 
definition we choose, a program must in any case be a text in a finite 
alphabet if it at all corresponds to our intuitive notions, and there are 
countably many such texts. A much stronger objection to the argument 
goes roughly as follows: what justification do we have for working with 
just one definition of what a program is? Could there perhaps exist an 
increasing hierarchy of precisely describable “methods of computation,” so
that, for every function from $\mathbb{Z}^+$ to $\mathbb{Z}^+$ we could choose a corresponding
program which could compute this function?

A fundamental discovery in the theory of computability was that this 
last question has a negative answer. We now have a unique and final 
formal notion which corresponds to the intuitive idea of semi-computability.
It can be stated as follows:


1.6. Church’s Thesis (weakest form). It is possible to give explicitly:


(a) a family of basic semi-computable functions; 

(b) a family of elementary operations which, starting from any semi-com- 
putable functions, allow new semi-computable functions to be con- 
structed; 


with the property that any semi-computable function can be obtained in a 
finite number of steps, where each step consists in applying one of the 
elementary operations to the functions constructed before and those in the 
family (a). 


1.7. Comment. Church’s thesis will be given a precise formulation in the 
next section: the basic functions and the elementary operations will be 
given explicitly. The exact mathematical theory of computability begins at 
that point. But it seemed important to indicate first the general significance 
of the discovery that such families of functions and operations exist at all 
and can even be given explicitly, a result that is far from obvious. 

This is an experimental fact, one of the most important discovered by 
logic. In the next section we discuss evidence of its value and usefulness. 
Now we merely note that this fact is related to the finiteness of the basic 
logical and set theoretic principles of mathematics (implicit, for example, 
in $L_1$Set), but is not identical to this finiteness.


2 Partial recursive functions 


2.1. In this section we give the precise definition and the basic properties of
a class of partial functions from $(\mathbb{Z}^+)^m$ to $(\mathbb{Z}^+)^n$ which we take as an
adequate formalization of the class of semi-computable functions. We give
the definition in a way parallel to the statement of Church’s thesis in 1.6.


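2.2. The basic functions

suc: $\mathbb{Z}^+ \to \mathbb{Z}^+$, suc$(x) = x + 1$;
$1^{(n)}\colon (\mathbb{Z}^+)^n \to \mathbb{Z}^+$, $1^{(n)}(x_1, \ldots, x_n) = 1$, $n \geq 0$;
$\mathrm{pr}_i^n\colon (\mathbb{Z}^+)^n \to \mathbb{Z}^+$, $\mathrm{pr}_i^n(x_1, \ldots, x_n) = x_i$, $n \geq 1$.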

2.3. The elementary operations on partial functions

(a) Composition (or substitution). This operation associates to every pair
of partial functions $f$ from $(\mathbb{Z}^+)^m$ to $(\mathbb{Z}^+)^n$ and $g$ from $(\mathbb{Z}^+)^n$ to $(\mathbb{Z}^+)^p$ the
function $h = g \circ f$ from $(\mathbb{Z}^+)^m$ to $(\mathbb{Z}^+)^p$ which is defined as follows:


$D(g \circ f) = f^{-1}(D(g)) = \{x \in (\mathbb{Z}^+)^m \mid x \in D(f),\ f(x) \in D(g)\}$;
$(g \circ f)(x) = g(f(x))$.
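
(b) Juxtaposition. This operation associates to partial functions $f_i$ from
$(\mathbb{Z}^+)^m$ to $(\mathbb{Z}^+)^{n_i}$, $i = 1, \ldots, k$, the function $(f_1, \ldots, f_k)$ from $(\mathbb{Z}^+)^m$ to
$(\mathbb{Z}^+)^{n_1} \times \cdots \times (\mathbb{Z}^+)^{n_k}$ which is defined as follows:

$D(f_1, \ldots, f_k) = D(f_1) \cap \cdots \cap D(f_k)$;
$(f_1, \ldots, f_k)(x_1, \ldots, x_m) = \langle f_1(x_1, \ldots, x_m), \ldots, f_k(x_1, \ldots, x_m)\rangle$.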
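
The basic functions together with composition and juxtaposition can be sketched in Python. The encoding is an assumption made for this illustration: every function takes one tuple of positive integers and returns a tuple, so that vector-valued functions compose directly, and a function is "undefined" at a point simply when the call does not return. The names `one`, `pr`, `compose`, and `juxtapose` are hypothetical.

```python
def suc(xs):
    """suc(x) = x + 1 on 1-tuples."""
    (x,) = xs
    return (x + 1,)


def one(xs):
    """The constant function 1^(n), for any number n >= 0 of arguments."""
    return (1,)


def pr(n, i):
    """The projection pr_i^n of an n-tuple onto its i-th coordinate."""
    def proj(xs):
        assert len(xs) == n and 1 <= i <= n
        return (xs[i - 1],)
    return proj


def compose(g, f):
    """(g o f)(x) = g(f(x)); it returns only if f(x) and then g(f(x)) do."""
    return lambda xs: g(f(xs))


def juxtapose(*fs):
    """(f_1, ..., f_k)(x) = <f_1(x), ..., f_k(x)>, defined where all f_i are."""
    def h(xs):
        out = ()
        for f in fs:
            out += f(xs)
        return out
    return h


# suc o pr_1^2 sends <x1, x2> to x1 + 1.
suc_of_first = compose(suc, pr(2, 1))
# Juxtaposition of pr_2^2, 1^(2), and suc o pr_1^2.
bundle = juxtapose(pr(2, 2), one, suc_of_first)
```

Because an undefined value is modelled by non-termination, the domains come out as in the definitions: `compose` is defined on $f^{-1}(D(g))$ and `juxtapose` on the intersection of the domains, without any extra bookkeeping.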


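(c) Recursion. This operation associates to a pair of partial functions $f$
from $(\mathbb{Z}^+)^n$ to $\mathbb{Z}^+$ and $g$ from $(\mathbb{Z}^+)^{n+2}$ to $\mathbb{Z}^+$ the partial function $h$ from
$(\mathbb{Z}^+)^{n+1}$ to $\mathbb{Z}^+$ which is defined by recursion on the last argument:


$h(x_1, \ldots, x_n, 1) = f(x_1, \ldots, x_n)$ (initial condition);
$h(x_1, \ldots, x_n, k + 1) = g(x_1, \ldots, x_n, k, h(x_1, \ldots, x_n, k))$, for $k \geq 1$ (recursive step).


The domain of definition $D(h)$ is also defined by recursion:


$\langle x_1, \ldots, x_n, 1\rangle \in D(h) \Leftrightarrow \langle x_1, \ldots, x_n\rangle \in D(f)$;
$\langle x_1, \ldots, x_n, k + 1\rangle \in D(h) \Leftrightarrow \langle x_1, \ldots, x_n, k\rangle \in D(h)$ and
$\langle x_1, \ldots, x_n, k, h(x_1, \ldots, x_n, k)\rangle \in D(g)$, for $k \geq 1$.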
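
The recursion operation can be sketched as a loop that starts from the initial condition and applies the recursive step $k - 1$ times. This is an illustrative sketch under assumptions: functions here take their arguments positionally and return a single integer, and the names `recursion`, `add`, and `factorial` are hypothetical. The multiplication used in `factorial` is just a convenient Python expression standing in for some given $g$.

```python
def recursion(f, g):
    """Build h with h(x, 1) = f(x) and h(x, k + 1) = g(x, k, h(x, k))."""
    def h(*args):
        *xs, k = args
        value = f(*xs)                # initial condition, k = 1
        for j in range(1, k):
            value = g(*xs, j, value)  # recursive step from j to j + 1
        return value
    return h


# h(x, 1) = x + 1 and h(x, k + 1) = h(x, k) + 1, so h(x, k) = x + k.
add = recursion(lambda x: x + 1, lambda x, k, prev: prev + 1)

# Case n = 0: h(1) = 1 and h(k + 1) = (k + 1) * h(k), so h(k) = k!.
factorial = recursion(lambda: 1, lambda k, prev: (k + 1) * prev)
```

If `f` or any intermediate call to `g` fails to return, then so does `h`, which matches the recursive description of $D(h)$: the value at $k + 1$ exists only when every earlier stage does.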
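
(d) The $\mu$-operator. This operation associates to a partial function $f$ from
$(\mathbb{Z}^+)^{n+1}$ to $\mathbb{Z}^+$ the partial function $h$ from $(\mathbb{Z}^+)^n$ to $\mathbb{Z}^+$ which is defined as
follows:


$D(h) = \{\langle x_1, \ldots, x_n\rangle \mid \exists x_{n+1} \geq 1,\ f(x_1, \ldots, x_n, x_{n+1}) = 1$ and
$\langle x_1, \ldots, x_n, k\rangle \in D(f)$ for all $k \leq x_{n+1}\}$;
$h(x_1, \ldots, x_n) = \min\{x_{n+1} \mid f(x_1, \ldots, x_n, x_{n+1}) = 1\}$.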


The general role of $\mu$ is to introduce “implicitly defined” functions, as is
often done in many areas of mathematics. Three remarks about the
definition of $\mu$ should be made at this point. First, we obviously chose the
minimal $y$ with $f(x_1, \ldots, x_n, y) = 1$ in order to ensure that the function $h$
is single-valued. The second observation is that, at first glance, it might
seem that the domain of definition of $h$ is artificially narrow. If, for
example, we have $f(x_1, \ldots, x_n, 2) = 1$ and $f(x_1, \ldots, x_n, 1)$ is not defined,
then we have taken $h(x_1, \ldots, x_n)$ to be undefined, rather than equal to 2.
This is done because we want to preserve intuitive semi-computability in
going from $f$ to $h$, as will be discussed in somewhat greater detail below
(see 2.7(a)).

Finally, we note that all the operations before $\mu$, if applied to everywhere
defined functions, give an everywhere defined function. This is
obviously not the case for $\mu$. Thus, $\mu$ is the only one of the operations
which causes partial functions to arise unavoidably.
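
The reason for the narrow domain becomes visible when the $\mu$-operator is written as an unbounded search. The sketch below is an illustration under assumptions: functions take positional integer arguments, an undefined value is modelled by a call that never returns, and the names `mu`, `ceil_sqrt`, and `exact_root` are hypothetical.

```python
def mu(f):
    """Build h(x) = the least k >= 1 with f(x, k) = 1, by trying k = 1, 2, ...

    The search must evaluate f(x, 1), f(x, 2), ... in turn, so if f fails
    to halt at some k below the first solution, h fails to halt as well.
    """
    def h(*xs):
        k = 1
        while f(*xs, k) != 1:
            k += 1
        return k
    return h


# Everywhere defined example: least k with k * k >= x.
ceil_sqrt = mu(lambda x, k: 1 if k * k >= x else 2)

# Here f is everywhere defined, yet h is properly partial:
# h(x) is defined only when x is a perfect square.
exact_root = mu(lambda x, k: 1 if k * k == x else 2)
```

`exact_root(10)` would loop forever, so it is deliberately not called here. It shows how $\mu$ produces a partial function from an everywhere defined one.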

2.4. Definition.

(a) A sequence of partial functions $f_1, \ldots, f_N$ is called a partial
recursive (respectively primitive recursive) description of the function
$f_N = f$ if

$f_1$ belongs to the family of basic functions;


$f_i$, $i \geq 2$, either belongs to the family of basic functions, or else is
obtained by applying one of the elementary operations
(respectively one of the elementary operations other
than $\mu$) to certain of the functions $f_1, \ldots, f_{i-1}$.


(b) A function f is called partial recursive (respectively primitive 
recursive) if it admits a partial recursive (respectively primitive recur- 
sive) description. 


(The analogy with the definition of a deduction in a formal language 
immediately catches our attention, and can sometimes be of use.) 
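
As a worked illustration of Definition 2.4 (not taken from the text), here is one possible primitive recursive description of addition on $\mathbb{Z}^+$. The numbering $f_1, \ldots, f_4$ is chosen for this example only.

```latex
\begin{align*}
f_1 &= \mathrm{suc} && \text{(basic function)} \\
f_2 &= \mathrm{pr}_3^3 && \text{(basic function)} \\
f_3 &= f_1 \circ f_2, \quad f_3(x, k, y) = y + 1 && \text{(composition)} \\
f_4 &= h, \quad h(x, 1) = f_1(x) = x + 1, \\
    &\phantom{{}= h, \quad} h(x, k + 1) = f_3(x, k, h(x, k)) = h(x, k) + 1 && \text{(recursion from } f_1 \text{ and } f_3\text{)}
\end{align*}
```

By induction on $k$ one gets $h(x, k) = x + k$. Since the $\mu$-operator is never used, the sequence $f_1, f_2, f_3, f_4$ is a primitive recursive description, and addition is primitive recursive.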


2.5. Church’s Thesis (usual form) 
(a) A function $f$ is semi-computable if and only if it is partial recursive.
(b) A function $f$ is computable if and only if both $f$ and $\chi_{D(f)}$ are partial
recursive.


Remark on terminology. Everywhere defined partial recursive functions 
are also called general recursive functions. If the domain of definition is 
either clear or not essential in a given context, we simply use the term 
“recursive.” (Note that every primitive recursive function is general recur- 
sive.) 


2.6. Use of Church’s thesis. Before discussing in detail the arguments 
supporting Church’s thesis, we indicate how it is used in practice in 
mathematics. Two basic applications are especially evident in the litera- 
ture. 

(a) Church’s thesis used for a definition of algorithmic undecidability. 
Suppose we have a countable sequence of mathematical “problems”
$P_1, P_2, \ldots$. Further, suppose that each problem has a “yes” or “no”
answer, and that the conditions in $P_n$ are written out “effectively” as a
function of $n$. Such a sequence $P = (P_n)$ is called a “mass problem.” We
associate to such a problem a function $f$ from $\mathbb{Z}^+$ to $\mathbb{Z}^+$:

$$D(f) = \{\, i \in \mathbb{Z}^+ \mid P_i \text{ has “yes” for an answer} \,\};$$
$$f(i) = 1, \quad \text{if } i \in D(f).$$

A mass problem $P$ is called algorithmically decidable if the functions $f$ and
$\chi_{D(f)}$ are partial recursive. Otherwise $P$ is called algorithmically undecidable.
We also distinguish the case when only $\chi_{D(f)}$ is not partial recursive
from the case when even $f$ is not partial recursive. The second type of
undecidability is worse than the first; we saw examples of this in §1.
Finally, a whole hierarchy of “degrees of undecidability” can be rigorously
defined and investigated.

A well-known example of a mass problem is the problem of word
identities in groups. Let $G$ be a finitely defined group, and let $a_1, \ldots, a_n \in G$
be elements. A “reduced word” in $a_1, \ldots, a_n$ is an expression of the
form $a_{i_1}^{\varepsilon_1} \cdots a_{i_k}^{\varepsilon_k}$, where $k \geq 1$, $\varepsilon_j = \pm 1$, and $\varepsilon_j = \varepsilon_{j+1}$ whenever $i_j = i_{j+1}$. We
number all the reduced words and ask the question $P_m$: “Does the $m$th
word represent the unit element of the group $G$?” The “mass problem”
$(P_m)$ turns out to be algorithmically decidable for certain groups $G$ and
elements $a_1, \ldots, a_n$, and algorithmically undecidable for others (Novikov,
Boone, Higman). The function $f$ in this case is always partial recursive, but
$\chi_{D(f)}$ is not always (see Chapter VIII).

For another example of an undecidable problem, this one connected 
with Diophantine equations, see Chapter VI. 

(b) Church’s thesis as a heuristic principle. The intuitive notion of 
“semi-computability” at first seems broader than the notion of “partial 
recursiveness,” and many problems concerning partial recursive functions 
become much easier if we replace the conditions in the problems by 
informal ideas and allow such ideas to be used to solve the problems. For 
example, the formula $e = \lim_{n \to \infty} (1 + 1/n)^n$ and the Euclidean algorithm
make it intuitively clear that the functions $f, g : \mathbb{Z}^+ \to \mathbb{Z}^+$ given by


$f(n)$ = the $n$th digit in the decimal expansion of $e$,


$g(n)$ = the $n$th prime number


are computable, but the verification that they are recursive requires rather 
painstaking constructions. 
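
To make the first, informal stage concrete, here is a minimal sketch of an intuitive algorithm for $g$. It uses plain trial division (our choice, not a method named in the text), and the function name `nth_prime` is hypothetical.

```python
from math import isqrt


def nth_prime(n: int) -> int:
    """Return the n-th prime number (n = 1, 2, ...) by trial division."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        # candidate is prime if no d with 2 <= d <= sqrt(candidate) divides it
        if all(candidate % d for d in range(2, isqrt(candidate) + 1)):
            count += 1
    return candidate
```

Nobody doubts that this program computes $g$; exhibiting a partial recursive description of the same function is the part that takes work.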

Church’s thesis allows us to solve such problems in two stages: (1) 
finding an informal solution using any intuitive algorithms we need; and 
(2) formalizing the solution. The second stage presupposes a certain 
proficiency in finding a partial recursive description for a wide variety of 
semi-computable functions, and Church’s thesis assures us that such a 
description exists. 

As proofs of recursiveness become more and more numerous in the 
literature, it becomes increasingly common to go through only the first 
stage of the solution; a striking example of this is Hartley Rogers’ book, 
Theory of Recursive Functions and Effective Computability (McGraw-Hill, 
New York, 1967). We shall also take such liberties toward the end of this 
book. All the same, there is a certain danger in this practice. It is possible 
that the habit of increasingly using informal arguments delayed the dis- 
covery of such a fundamental fact as the result that recursively enumerable 
sets and Diophantine sets coincide. 

2.7. Arguments in support of Church’s thesis

(a) First of all, the basic functions clearly must be computable, no 
matter how we precisely define the notion of computability. Furthermore, 
when the elementary operations are applied to semi-computable functions, 
they again give a semi-computable function. A program to semi-compute 
the latter function can easily be put together from the programs which 
semi-compute the original functions. We shall only consider the case of the
$\mu$-operator in detail, leaving the simple construction of the other three
programs to the reader.

In the notation of 2.3(d), let $f$ be a semi-computable function from
$(\mathbb{Z}^+)^{n+1}$ to $\mathbb{Z}^+$. In order to compute $h(x_1, \ldots, x_n)$, we go through the
vectors $\langle x_1, \ldots, x_n, 1 \rangle, \langle x_1, \ldots, x_n, 2 \rangle, \ldots$ in the order of increasing
last coordinate, and compute the values of $f$ at these vectors. If
$\langle x_1, \ldots, x_n \rangle \in D(h)$, where $h$ is obtained from $f$ by applying the
$\mu$-operator, then the program for $f$ successively computes


$$f(x_1, \ldots, x_n, 1), \; \ldots, \; f(x_1, \ldots, x_n, y - 1),$$


and finally $f(x_1, \ldots, x_n, y) = 1$. The least such $y$, if it exists, must be given
as output; it will be the value of $h$ at the point $\langle x_1, \ldots, x_n \rangle$. On the other
hand, if it turns out that one of the values $f(x_1, \ldots, x_n, k)$ (before we
reach $f = 1$) is not defined, then either the program which semi-computes $f$
will work infinitely long, or else it will give an answer not in $\mathbb{Z}^+$, which
must then be given as output. But then, by definition, $h$ is not defined at
the point $\langle x_1, \ldots, x_n \rangle$, and the behavior of the program for $h$ still agrees
with the definition of $h$ being semi-computable.
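
The search just described can be sketched as a program. This is only an illustration under our own conventions: a program for $f$ is modelled as a Python function that returns `None` when it "gives an answer not in $\mathbb{Z}^+$" (the other kind of undefinedness, running forever, cannot be shown in a finite example), and the name `mu` is hypothetical.

```python
from typing import Callable, Optional


def mu(f: Callable[..., Optional[int]]) -> Callable[..., Optional[int]]:
    """Return h with h(x...) = least y such that f(x..., y) = 1."""
    def h(*xs: int) -> Optional[int]:
        y = 1
        while True:                  # may loop forever: h is only semi-computed
            value = f(*xs, y)
            if value is None:        # f undefined before reaching 1: h undefined
                return None
            if value == 1:
                return y
            y += 1
    return h


# f(x, y) = 1 exactly when y*y >= x, so h(x) is the least y with y*y >= x.
ceil_sqrt = mu(lambda x, y: 1 if y * y >= x else 2)

# g is undefined at y = 1 although g(x, 2) = 1; h is then undefined, not 2.
narrow = mu(lambda x, y: None if y == 1 else 1)
```

The second example shows why the domain of $h$ has to be taken "artificially narrow": the program never gets past the undefined value at $y = 1$.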

From all this we conclude that partial recursive functions are semi-com- 
putable. However, the stronger part of Church’s thesis is the converse: 
semi-computable functions are partial recursive. (The definition of computa- 
bility in terms of semi-computability is simply taken from §1 without any 
changes.) As has been said, this result is an experimental fact. The 
experimental evidence for it is divided into several classes, which we 
consider in (b)-(d) below.

(b) In the literature we find a huge collection of recursive descriptions of 
various computable and semi-computable functions. See, for example, 
Rozsa Péter, Recursive Functions (Academic Press, New York, 1967). We 
shall give part of this list in the next section. We also find certain 
techniques for composing recursive descriptions which are applicable to 
entire classes of (semi-) computable functions. Every time an author has 
tried to find a partial recursive description of a (semi-) computable 
function, he has met with success. 

(c) Turing proposed a mathematical characterization of an abstract 
computer, and gave strong arguments to the effect that this computer is 
universal, i.e., it can (semi-) compute any (semi-) computable function. His 
arguments came from a detailed analysis of the characteristic features of 
determinate computational processes. (We again recall that we have not at
all concerned ourselves with formalizing computational processes, but only 
with the results of such processes.) It turned out that the class of functions 
which are semi-computable by Turing machines exactly coincides with the 
class of partial recursive functions. 

(d) Church, Post, Markov, Kolmogorov, Uspenskii, and others have 
proposed other determined schemes for processing information of a gen- 
eral (not necessarily number theoretic) character. In all cases it has turned 
out that if the sets of input and output are numbered in a suitable 
“effective” way, these methods lead to a class of maps from $\mathbb{Z}^+$ to $\mathbb{Z}^+$
which coincides with some subclass of the partial recursive functions. 

For further discussion of Church’s thesis, we refer the reader to the 
literature; see, in particular, S. Kleene, Introduction to Metamathematics
(Van Nostrand, New York-Toronto, 1952). 


3 Basic examples of recursiveness 


3.1. In this section we give a short list of recursive functions and a selection 
of basic techniques for proving recursiveness. Both these lists will subse- 
quently be enlarged when needed (in particular, see Chapter VII). 


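3.2. (a) $\mathrm{sum}_2 : (\mathbb{Z}^+)^2 \to \mathbb{Z}^+$, $\langle x_1, x_2 \rangle \mapsto x_1 + x_2$.
Use recursion on $x_2$ starting from the initial condition

$$x_1 + 1 = \mathrm{sum}_2(x_1, 1) = \mathrm{suc}(x_1)$$

and applying the recursive step

$$x_1 + k + 1 = \mathrm{sum}_2(x_1, k + 1) = \mathrm{suc}(\mathrm{sum}_2(x_1, k)).$$

(b) $\mathrm{sum}_n : (\mathbb{Z}^+)^n \to \mathbb{Z}^+$, $\langle x_1, \ldots, x_n \rangle \mapsto \sum_{i=1}^{n} x_i$, $n \geq 3$.
Suppose that we already know that $\mathrm{sum}_{n-1}$ is recursive. We can obtain
$\mathrm{sum}_n$ by juxtaposition and composition as follows:

$$\mathrm{sum}_n = \mathrm{sum}_2 \circ (\mathrm{sum}_{n-1} \circ (\mathrm{pr}_1^n, \ldots, \mathrm{pr}_{n-1}^n), \mathrm{pr}_n^n).$$

Another version is to use recursion on $x_n$, starting from the initial condition
$\mathrm{suc} \circ \mathrm{sum}_{n-1}$ and applying the recursive step

$$\sum_{i=1}^{n-1} x_i + k + 1 = \mathrm{suc}(\mathrm{sum}_n(x_1, \ldots, x_{n-1}, k)).$$

This choice of recursive descriptions, even of “natural” ones, will become
even more numerous as the functions become more complicated.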


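3.3. (a) $\mathrm{prod}_2 : (\mathbb{Z}^+)^2 \to \mathbb{Z}^+$, $\langle x_1, x_2 \rangle \mapsto x_1 x_2$.

Use recursion on $x_2$, starting from the initial condition $x_1$ and applying the
recursive step

$$x_1 (k + 1) = x_1 k + x_1 = \mathrm{sum}_2(x_1 k, x_1).$$

(b) $\mathrm{prod}_n : (\mathbb{Z}^+)^n \to \mathbb{Z}^+$, $\langle x_1, \ldots, x_n \rangle \mapsto x_1 \cdots x_n$, $n \geq 3$.

$$\mathrm{prod}_n = \mathrm{prod}_2 \circ (\mathrm{prod}_{n-1} \circ (\mathrm{pr}_1^n, \ldots, \mathrm{pr}_{n-1}^n), \mathrm{pr}_n^n).$$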
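
The recursions for sum and product can be checked mechanically. The sketch below assumes the recursion operation in the form $h(x, 1) = f(x)$, $h(x, k+1) = g(x, k, h(x, k))$; the helper name `rec` and the use of Python lambdas for the basic functions are our own conventions.

```python
def rec(f, g):
    """Recursion operator: h(x..., 1) = f(x...), h(x..., k+1) = g(x..., k, h(x..., k))."""
    def h(*args):
        *xs, last = args
        value = f(*xs)                     # initial condition at last = 1
        for k in range(1, last):           # recursive steps k = 1, ..., last - 1
            value = g(*xs, k, value)
        return value
    return h


suc = lambda x: x + 1

# sum_2(x1, 1) = suc(x1),  sum_2(x1, k+1) = suc(sum_2(x1, k))
sum2 = rec(suc, lambda x1, k, prev: suc(prev))

# prod_2(x1, 1) = x1,  prod_2(x1, k+1) = sum_2(prod_2(x1, k), x1)
prod2 = rec(lambda x1: x1, lambda x1, k, prev: sum2(prev, x1))
```

All arguments are assumed to lie in $\mathbb{Z}^+$, so the recursion starts at 1 rather than 0.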
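
3.4. (a) $\mathbb{Z}^+ \to \mathbb{Z}^+$,

$$x \mapsto x \dot{-} 1 = \begin{cases} x - 1, & \text{if } x \geq 2; \\ 1, & \text{if } x = 1. \end{cases}$$

Use recursion with the functions

$$f : (\mathbb{Z}^+)^0 \to \mathbb{Z}^+, \quad \cdot \mapsto 1;$$
$$g = \mathrm{pr}_1^2 : (\mathbb{Z}^+)^2 \to \mathbb{Z}^+, \quad \langle x_1, x_2 \rangle \mapsto x_1.$$

(b) $(\mathbb{Z}^+)^2 \to \mathbb{Z}^+$,

$$\langle x_1, x_2 \rangle \mapsto x_1 \dot{-} x_2 = \begin{cases} x_1 - x_2, & \text{if } x_1 > x_2; \\ 1, & \text{if } x_1 \leq x_2. \end{cases}$$

This “truncated difference” is obtained by applying recursion to the
functions

$$f(x_1) = x_1 \dot{-} 1;$$
$$g(x_1, x_2, x_3) = x_3 \dot{-} 1.$$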


3.5. $F : (\mathbb{Z}^+)^n \to \mathbb{Z}^+$, where $F$ is any polynomial in $x_1, \ldots, x_n$ with integer
coefficients which only takes values in $\mathbb{Z}^+$.

If all the coefficients in $F$ are nonnegative, then $F$ is a sum of products
of the functions $\mathrm{pr}_i^n : \langle x_1, \ldots, x_n \rangle \mapsto x_i$. Otherwise, we write $F = F^+ - F^-$,
where $F^+$ and $F^-$ have nonnegative coefficients, and at all points of
$(\mathbb{Z}^+)^n$ the nontruncated difference coincides with the truncated difference
$F^+ \dot{-} F^-$ because of the assumption concerning $F$.

We shall often use the recursiveness of the function $(x_1 - x_2)^2 + 1$, or
$h = (f - g)^2 + 1$, where $f$ and $g$ are recursive. This technique allows us to
identify the set on which $f = g$ with the “level set of $h$ at 1,” i.e., the set on
which $h = 1$.


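3.6. “Step functions”: for each $a, b, x_0 \in \mathbb{Z}^+$, the function defined by


$$s_{x_0}^{a,b}(x) = \begin{cases} a, & \text{for } x \leq x_0, \\ b, & \text{for } x > x_0. \end{cases}$$

If $x_0 = 1$, we obtain this function by recursion with initial value $a$ and all
the succeeding values $b$. In the general case we set

$$s_{x_0}^{a,b}(x) = s_1^{a,b}\big((x + 1) \dot{-} x_0\big).$$


3.7. $\mathrm{rem}(x, y)$ = the remainder in $[1, x]$ (since we cannot use zero!) when $y$
is divided by $x$.

We have:

$$\mathrm{rem}(x, 1) = 1,$$
$$\mathrm{rem}(x, y + 1) = \begin{cases} 1, & \text{if } \mathrm{rem}(x, y) = x; \\ \mathrm{suc} \circ \mathrm{rem}(x, y), & \text{if } \mathrm{rem}(x, y) \neq x. \end{cases}$$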




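We now apply a somewhat artificial technique. We consider the step
function $s = s_1^{2,1}$, i.e., $s(1) = 2$ and $s(x) = 1$ if $x \geq 2$, and we set

$$\varphi(x, y) = s\big((\mathrm{rem}(x, y) - x)^2 + 1\big).$$

Obviously,

$$\mathrm{rem}(x, y) \neq x \iff \varphi(x, y) = 1,$$
$$\mathrm{rem}(x, y) = x \iff \varphi(x, y) = 2,$$

so that

$$\mathrm{rem}(x, y + 1) = 2\,\mathrm{suc}(\mathrm{rem}(x, y)) \dot{-} \varphi(x, y)\,\mathrm{suc}(\mathrm{rem}(x, y)).$$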


This gives a recursive definition of rem. 
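
As a sanity check, the recursive definition of rem can be run directly. The sketch below is our own; it takes the truncated difference on $\mathbb{Z}^+$ to be $a - b$ when $a > b$ and 1 otherwise, and uses the step function with $s(1) = 2$, $s(x) = 1$ for $x \geq 2$. The helper names are hypothetical.

```python
def monus(a: int, b: int) -> int:
    """Truncated difference on Z^+: a - b if a > b, otherwise 1."""
    return a - b if a > b else 1


def s(x: int) -> int:
    """Step function with s(1) = 2 and s(x) = 1 for x >= 2."""
    return 2 if x == 1 else 1


def rem(x: int, y: int) -> int:
    """Remainder in [1, x] of y divided by x, computed by the recursion on y."""
    r = 1                                   # rem(x, 1) = 1
    for _ in range(1, y):
        phi = s((r - x) ** 2 + 1)           # phi = 2 if r == x, else 1
        r = monus(2 * (r + 1), phi * (r + 1))
    return r
```

When `phi` is 1 the step returns `r + 1`; when `phi` is 2 the two terms are equal and the truncated difference drops back to 1.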
We next describe this technique in a more general form. 


3.8. Suppose $h$ is defined by “recursion with conditions,” i.e.,


$$h(x_1, \ldots, x_n, 1) = f(x_1, \ldots, x_n), \qquad h(x_1, \ldots, x_n, k + 1) = g_i(x_1, \ldots, x_n, k, h(x_1, \ldots, x_n, k)),$$

if the condition $C_i(x_1, \ldots, x_n, k, h)$ holds, $i = 1, \ldots, m$, where the
exhaustive and mutually exclusive conditions $C_i$ are given in the form

$$C_i \text{ is fulfilled} \iff \varphi_i(x_1, \ldots, x_n, k, h(x_1, \ldots, x_n, k)) = 1,$$


with $\varphi_i$ an everywhere defined recursive function which only takes the
values 1 and 2. Then we can write the recursive step as follows:


$$h(x_1, \ldots, x_n, k + 1) = 2 \sum_{i=1}^{m} g_i(x_1, \ldots, x_n, k, h) \;\dot{-}\; \sum_{i=1}^{m} (\varphi_i g_i)(x_1, \ldots, x_n, k, h).$$
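This device allows us to show that the following functions, which will be
needed later, are primitive recursive:


3.9.
$$\mathrm{qt}(x, y) = \begin{cases} \text{the integral part of } y/x, & \text{if } y/x \geq 1; \\ 1, & \text{if } y/x < 1. \end{cases}$$

We have:

$$\mathrm{qt}(x, 1) = 1;$$
$$\mathrm{qt}(x, y + 1) = \begin{cases} \mathrm{qt}(x, y), & \text{if } \mathrm{rem}(x, y + 1) \neq x; \\ \mathrm{qt}(x, y) + 1, & \text{if } \mathrm{rem}(x, y + 1) = x \text{ and } y + 1 \neq x; \\ 1, & \text{if } y + 1 = x. \end{cases}$$

We reduce the conditions to the standard form 3.8 using the functions

$$\bar{s}\big((\mathrm{rem}(x, y + 1) - x)^2 + 1\big),$$
$$s\big((\mathrm{rem}(x, y + 1) - x)^2 + 1\big) \cdot \bar{s}\big((x - y - 1)^2 + 1\big),$$
$$s\big((x - y - 1)^2 + 1\big),$$

where $s = s_1^{1,2}$ and $\bar{s} = s_1^{2,1}$.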
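
The device of 3.8 and its use for qt can be tried out in a few lines. This is a sketch under our own conventions: the condition indicators (1 for "holds", 2 for "fails") are computed with ordinary Python comparisons rather than with step functions, rem is taken from its closed form, and the helper name `select` is hypothetical.

```python
def monus(a: int, b: int) -> int:
    """Truncated difference on Z^+: a - b if a > b, otherwise 1."""
    return a - b if a > b else 1


def rem(x: int, y: int) -> int:
    """Remainder in [1, x] of y divided by x."""
    return (y - 1) % x + 1


def select(gs, phis):
    """2*sum(g_i) -. sum(phi_i*g_i): equals g_j when phi_j = 1 and every other phi_i = 2."""
    return monus(2 * sum(gs), sum(p * g for p, g in zip(phis, gs)))


def qt(x: int, y: int) -> int:
    q = 1                                           # qt(x, 1) = 1
    for k in range(1, y):                           # pass from qt(x, k) to qt(x, k+1)
        c1 = 1 if rem(x, k + 1) != x else 2         # x does not divide k+1
        c3 = 1 if k + 1 == x else 2                 # k+1 equals x
        c2 = 1 if (c1 == 2 and c3 == 2) else 2      # x divides k+1 and k+1 != x
        q = select([q, q + 1, 1], [c1, c2, c3])
    return q
```

Exactly one indicator equals 1 at each step, so `select` returns the corresponding branch and the truncated difference never actually truncates.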


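3.10. $\mathrm{rad}(x)$ = the integral part of $\sqrt{x}$.


We have:

$$\mathrm{rad}(1) = 1,$$
$$\mathrm{rad}(x + 1) = \begin{cases} \mathrm{rad}(x), & \text{if } \mathrm{qt}(\mathrm{rad}(x) + 1, x + 1) < \mathrm{rad}(x) + 1; \\ \mathrm{rad}(x) + 1, & \text{if } \mathrm{qt}(\mathrm{rad}(x) + 1, x + 1) = \mathrm{rad}(x) + 1. \end{cases}$$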


The reduction of these conditions to the standard form 3.8 will be left to 
the reader. 
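
The recursion for rad is easy to test numerically. In this sketch (ours, with qt replaced by its closed form `max(1, y // x)` for brevity) the result is compared against Python's exact integer square root.

```python
from math import isqrt


def qt(x: int, y: int) -> int:
    """Integral part of y/x if y/x >= 1, and 1 otherwise."""
    return max(1, y // x)


def rad(x: int) -> int:
    """Integral part of sqrt(x), by recursion on x."""
    r = 1                                  # rad(1) = 1
    for k in range(1, x):                  # pass from rad(k) to rad(k+1)
        if qt(r + 1, k + 1) == r + 1:      # happens exactly when k + 1 = (r + 1)^2
            r += 1
    return r
```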


3.11. (a) $\min(x, y)$:

$$\min(x, 1) = 1,$$
$$\min(x, y + 1) = \begin{cases} \min(x, y), & \text{if } x \leq y; \\ \min(x, y) + 1, & \text{if } x > y. \end{cases}$$

(b) $\max(x, y)$: analogous.


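3.12. If $f(x_1, \ldots, x_n)$ is recursive, then


$$S_f(x_1, \ldots, x_n) = \sum_{k=1}^{x_n} f(x_1, \ldots, x_{n-1}, k) \quad \text{and} \quad P_f(x_1, \ldots, x_n) = \prod_{k=1}^{x_n} f(x_1, \ldots, x_{n-1}, k)$$


are recursive.


In fact,

$$S_f(x_1, \ldots, x_{n-1}, k + 1) = S_f(x_1, \ldots, x_{n-1}, k) + f(x_1, \ldots, x_{n-1}, k + 1),$$
$$P_f(x_1, \ldots, x_{n-1}, k + 1) = P_f(x_1, \ldots, x_{n-1}, k)\, f(x_1, \ldots, x_{n-1}, k + 1).$$

3.13. If $f(x_1, \ldots, x_n)$ is recursive, then so are the functions obtained from $f$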
by: 


(a) any permutation of the arguments; 

(b) adding any number of “dummy” arguments; 

(c) identifying the elements of any subset of the arguments 
(f(x, x) instead of f(x,y), and so on). 


In fact, all of these functions can be obtained from $f$ and the various
$\mathrm{pr}_i^n$ using composition and juxtaposition.

3.14. A map $f : (\mathbb{Z}^+)^m \to (\mathbb{Z}^+)^n$ is recursive if and only if all of its components
$\mathrm{pr}_i^n \circ f$ are recursive.

This is obvious. 

In conclusion, we note that all the specific functions described above 
are primitive recursive, and that all the above general operations, when 
applied to primitive recursive functions, yield primitive recursive functions. 
Starting in the next section, we shall make essential use of the $\mu$-operator,
which was defined in 2.3(d). 


4 Enumerable and decidable sets 


4.1. Definition. A set $E \subset (\mathbb{Z}^+)^n$ is called recursively enumerable if there
exists a partial recursive function f such that E = D(f) (the domain of 
definition of f). 


The discussion in §1 and §2 showed that recursive enumerability has the 
following intuitive meaning: there exists a program which identifies the 
elements $x$ in $E$ but which might not identify the elements not in $E$. Later,
in 4.12 and 4.18, we shall give another intuitive description of recursively 
enumerable sets which is more closely related to the etymology of the 
name: these are sets all of whose elements can be obtained using a suitable 
“generating” program (perhaps with repetitions and with no indication of the 
order in which the elements occur). 

The concept of a recursively enumerable set occupies a central place in 
the theory of computability, alongside the concept of a partial recursive 
function. It will later be clear, in particular from Proposition 4.15, that 
either of these concepts can be reduced to the other one. However, only by 
using both ideas together do we obtain the flexibility necessary for efficient 
proofs. 

We begin with the following simple fact. 

Recall that the level set at $m$ (or simply the $m$-level) of a function $f$ from
$(\mathbb{Z}^+)^n$ to $\mathbb{Z}^+$ is the set $E \subset D(f)$ such that


$$x \in E \iff f(x) = m.$$


4.2. Proposition. The following three classes of sets coincide: 


(a) Recursively enumerable sets. 
(b) Level sets of partial recursive functions. 
(c) Level sets at 1 of partial recursive functions. 
(a) $\subset$ (c). Suppose that $E$ is recursively enumerable, so that $E = D(f)$,
where $f$ is partial recursive. Then $E$ = the 1-level of the function
$1^{(1)} \circ f$.
(b) = (c). The $m$-level of $f$ coincides with the 1-level of $(f - m)^2 + 1$. The
function $(f - m)^2 + 1$ is partial recursive whenever $f$ is, by Proposition 3.5.
(c) $\subset$ (a). Suppose that $E$ is the 1-level of a partial recursive function
$f(x_1, \ldots, x_n)$. Set


$$g(x_1, \ldots, x_n) = \min\{\, y \mid (f(x_1, \ldots, x_n) - 1)^2 + y = 1 \,\}.$$

Obviously, $g$ is partial recursive and $E = D(g)$. □


The following much more difficult assertion, along with its corollaries, 
constitutes the central result of this section. 


4.3. Theorem. The following two classes of sets coincide: 


(a) Recursively enumerable sets. 
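(b) Projections of level sets of primitive recursive functions with values in
$\mathbb{Z}^+$.

4.4. FIRST PART OF THE PROOF. We first recall that, if we are given a set
$E \subset (\mathbb{Z}^+)^{n+m}$, then its projection (“onto the space of the first $n$ coordinates”)
is the set $F \subset (\mathbb{Z}^+)^n$ which is defined as follows:


$$\langle x_1, \ldots, x_n \rangle \in F \iff \exists \langle y_1, \ldots, y_m \rangle \in (\mathbb{Z}^+)^m, \ \langle x_1, \ldots, x_n, y_1, \ldots, y_m \rangle \in E.$$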


(From this point on, we shall not adhere to the practice in Part I of using 
different notation for “variable coordinates” and for particular values of 
the coordinates.) We similarly define the projection “onto the coordinates
with indices $(i_1, \ldots, i_n) \subset (1, \ldots, n + m)$.” The number $m$ is called the
codimension of the projection. The canonical map $E \to F$ (as well as its
image) is also customarily called a projection, but this is not likely to cause
any confusion.

For the time being we shall call projections of level sets of primitive 
recursive functions primitive enumerable sets. The first part of the proof 
consists in showing that primitive enumerable sets are recursively enumer- 
able; the second part consists in verifying the converse implication. 

Thus, let $f(x_1, \ldots, x_n; x_{n+1}, \ldots, x_{n+m})$ be a primitive recursive function,
and let $E$ be the projection of its 1-level onto the first $n$ coordinates.
(We need only consider 1-levels because of the consideration used once
before: the $k$-level of $f$ coincides with the 1-level of $f' = (f - k)^2 + 1$.) We
explicitly construct a partial recursive function $g$ such that $E = D(g)$.

We distinguish three cases, depending on the codimension of the projection:
$m = 0$, $m = 1$, and $m \geq 2$.

Case (a): $m = 0$. Then $E$ = the 1-level of $f$ $\Rightarrow$ $E$ is recursively enumerable,
by Proposition 4.2 (where $g$ is constructed explicitly).

Case (b): $m = 1$. Let


$$g(x_1, \ldots, x_n) = \min\{\, x_{n+1} \mid f(x_1, \ldots, x_n, x_{n+1}) = 1 \,\}.$$


Obviously, $g$ is partial recursive, and $D(g) = E$. (Notice that we have used
here the fact that $D(f) = (\mathbb{Z}^+)^{n+1}$.)

Case (c): $m \geq 2$. We reduce this case to the previous one using the
following lemma, which is also important in many other situations and is
of interest in its own right (as a statement that there is no notion of
dimension in “recursive geometry”).


4.5. Lemma. For each $m \geq 1$ there exists a one-to-one mapping $t^{(m)} : \mathbb{Z}^+ \to
(\mathbb{Z}^+)^m$ such that


(a) The function $t_i^{(m)} = \mathrm{pr}_i^m \circ t^{(m)}$ is primitive recursive for all $1 \leq i \leq m$.
(b) The inverse function $\tau^{(m)} : (\mathbb{Z}^+)^m \to \mathbb{Z}^+$ is primitive recursive.


4.6. How the lemma is used. Suppose that the lemma is true. We apply it to 
the situation in case (c) in 4.4 as follows. For $m \geq 2$ we set


$$g(x_1, \ldots, x_n, y) = f(x_1, \ldots, x_n, t_1^{(m)}(y), \ldots, t_m^{(m)}(y)).$$


Obviously, g is primitive recursive if f is. It is easy to see that E coincides 
with the projection of the 1-level of g onto the first n coordinates. Since 
this is a projection of codimension 1, we have reduced this case to the 
previous one. 


4.7. PROOF OF THE LEMMA. The case $m = 1$ is trivial. We use induction on
$m$, starting with $m = 2$.

Construction of $t^{(2)}$. We first construct $\tau^{(2)} : (\mathbb{Z}^+)^2 \to \mathbb{Z}^+$ explicitly by
setting


$$\tau^{(2)}(x_1, x_2) = \tfrac{1}{2}\big((x_1 + x_2)^2 - x_1 - 3x_2 + 2\big).$$


It is easy to see that, if we list the pairs $\langle x_1, x_2 \rangle \in (\mathbb{Z}^+)^2$ in “Cantor order,”
i.e., according to increasing $x_1 + x_2$ and, among those with given $x_1 + x_2$,
according to increasing $x_1$, then $\tau^{(2)}(x_1, x_2)$ will be precisely the index of
the pair $\langle x_1, x_2 \rangle$ in this list. Thus, $\tau^{(2)}$ is a one-to-one correspondence and,
moreover, is primitive recursive (where we use Proposition 3.5 and then the
recursiveness of qt in 3.9 to take care of the $\tfrac{1}{2}$).

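The calculation of the pair $\langle x_1, x_2 \rangle$ as a function of its index $y$ is an
elementary problem, and results in the following formulas for the inverse
function $t^{(2)}$:


$$t_1^{(2)}(y) = y - \tfrac{1}{2}\Big[\sqrt{2y - \tfrac{7}{4}} - \tfrac{1}{2}\Big]\Big(\Big[\sqrt{2y - \tfrac{7}{4}} - \tfrac{1}{2}\Big] + 1\Big),$$
$$t_2^{(2)}(y) = \Big[\sqrt{2y - \tfrac{7}{4}} - \tfrac{1}{2}\Big] - t_1^{(2)}(y) + 2.$$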


Here [z] denotes the integral part of z. The verification that these functions 
are primitive recursive using the results (and techniques) of §3 is left to the 
reader as an exercise. 
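
The numbering of pairs and its inverse can be checked by direct computation. In this sketch (ours) the quantity $[\sqrt{2y - 7/4} - 1/2]$ is computed exactly as `(isqrt(8*y - 7) - 1) // 2`, which avoids floating point; the function names `tau2` and `t2` are hypothetical.

```python
from math import isqrt


def tau2(x1: int, x2: int) -> int:
    """Index of the pair (x1, x2) in Cantor order, starting from 1."""
    return ((x1 + x2) ** 2 - x1 - 3 * x2 + 2) // 2


def t2(y: int) -> tuple[int, int]:
    """Inverse of tau2: the y-th pair in Cantor order."""
    r = (isqrt(8 * y - 7) - 1) // 2        # integral part of sqrt(2y - 7/4) - 1/2
    x1 = y - r * (r + 1) // 2
    x2 = r - x1 + 2
    return x1, x2
```

Listing `t2(1), t2(2), ...` reproduces the Cantor order: first the pair with sum 2, then the pairs with sum 3 in order of increasing first coordinate, and so on.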

Construction of $t^{(m)}$, $m \geq 3$. Suppose that $t^{(m-1)}$ and $\tau^{(m-1)}$ have already
been constructed with the required properties. We first set


$$\tau^{(m)}(x_1, \ldots, x_m) = \tau^{(2)}\big(\tau^{(m-1)}(x_1, \ldots, x_{m-1}), x_m\big).$$




It is clear that $\tau^{(m)}$ is one-to-one and primitive recursive. Solving the
equation $\tau^{(2)}(\tau^{(m-1)}(x_1, \ldots, x_{m-1}), x_m) = y$ in two steps, we obtain the
following formulas for the inverse function $t^{(m)}$:


$$t_m^{(m)}(y) = t_2^{(2)}(y),$$
$$t_i^{(m)}(y) = t_i^{(m-1)}(t_1^{(2)}(y)), \quad 1 \leq i \leq m - 1.$$


The $t_i^{(m)}$ are primitive recursive by the induction assumption. This completes
the proof of the lemma, and by the same token the first part of the
proof of Theorem 4.3. □
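
The inductive construction amounts to folding the pair numbering over a tuple. The sketch below is our own illustration; it repeats the pair functions so as to be self-contained, and the names `tau` and `t` are hypothetical.

```python
from math import isqrt


def tau2(x1: int, x2: int) -> int:
    return ((x1 + x2) ** 2 - x1 - 3 * x2 + 2) // 2


def t2(y: int) -> tuple[int, int]:
    r = (isqrt(8 * y - 7) - 1) // 2
    x1 = y - r * (r + 1) // 2
    return x1, r - x1 + 2


def tau(xs: tuple[int, ...]) -> int:
    """tau^(m): number an m-tuple by applying tau2 from left to right."""
    code = xs[0]
    for x in xs[1:]:
        code = tau2(code, x)
    return code


def t(m: int, y: int) -> tuple[int, ...]:
    """t^(m): split off the last coordinate with t2, then decode the rest."""
    if m == 1:
        return (y,)
    head, last = t2(y)
    return t(m - 1, head) + (last,)
```

Since every $y \in \mathbb{Z}^+$ decodes to exactly one $m$-tuple, a search over $m$ auxiliary variables can always be replaced by a search over one.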


SECOND PART OF THE PROOF. We must now show that every recursively 
enumerable set is primitive enumerable. We begin with the following 
property of the class of primitive enumerable sets. 


4.8. Lemma. The class of primitive enumerable sets is closed with respect to 
the following operations: finite direct product, finite intersection, finite 
union, and projection. 

PROOF. Let $E, E' \subset (\mathbb{Z}^+)^n$ and $E_1 \subset (\mathbb{Z}^+)^m$ be three primitive enumerable
sets which are projections of the 1-levels of the primitive recursive functions
$f$, $f'$, and $f_1$, respectively:


$$x = \langle x_1, \ldots, x_n \rangle \in E \iff \exists y = \langle y_1, \ldots, y_p \rangle, \ f(x, y) = 1,$$
$$x = \langle x_1, \ldots, x_n \rangle \in E' \iff \exists z = \langle z_1, \ldots, z_q \rangle, \ f'(x, z) = 1,$$
$$u = \langle u_1, \ldots, u_m \rangle \in E_1 \iff \exists v = \langle v_1, \ldots, v_r \rangle, \ f_1(u, v) = 1.$$

We then have:

$E \times E_1$ = a projection of the 1-level of the function
$$g(x, u; y, v) = f(x, y)\, f_1(u, v);$$
$E \cup E'$ = a projection of the 1-level of the function
$$g(x; y, z) = (f(x, y) - 1)(f'(x, z) - 1) + 1;$$
$E \cap E'$ = a projection of the 1-level of the function
$$g(x; y, z) = f(x, y)\, f'(x, z).$$


Closure with respect to the projection operation is clear from the definition.
Lemma 4.8 is proved. □
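
The two arithmetic combinations used for union and intersection can be verified on a small range of values. In this sketch (ours) `a` and `b` stand for the values $f(x, y)$ and $f'(x, z)$, which lie in $\mathbb{Z}^+$; the function names are hypothetical.

```python
def g_union(a: int, b: int) -> int:
    """Equals 1 exactly when a == 1 or b == 1 (for a, b in Z^+)."""
    return (a - 1) * (b - 1) + 1


def g_inter(a: int, b: int) -> int:
    """Equals 1 exactly when a == 1 and b == 1 (for a, b in Z^+)."""
    return a * b
```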


Now let $E$ be a recursively enumerable set. We realize $E$ as the 1-level
of a partial recursive function $f$ from $(\mathbb{Z}^+)^n$ to $\mathbb{Z}^+$ using Proposition 4.2,
and we note that, to prove that $E$ is primitive enumerable, it suffices to
show that the graph $\Gamma_f \subset (\mathbb{Z}^+)^n \times \mathbb{Z}^+$ of $f$ is primitive enumerable. In fact,
it is clear that $E$ = the 1-level of $f$ = the projection of the set
$\Gamma_f \cap [(\mathbb{Z}^+)^n \times \{1\}]$ onto the first $n$ coordinates. Here the set $\{1\} \subset \mathbb{Z}^+$ is
primitive enumerable (for example, by 3.6) so that, if we prove that $\Gamma_f$ is
primitive enumerable, it will follow from Lemma 4.8 that the same is true
for $E$. Thus, our problem is finally reduced to the following form: we must
prove that the graph of a partial recursive function $f$ is primitive enumerable.
To do this we verify that first, the graphs of the simplest functions are
primitive enumerable; and second, if we apply any of the elementary
operations to functions having primitive enumerable graphs, then the
resulting function also has a primitive enumerable graph.

Graphs of the basic functions.


$$\Gamma_{\mathrm{suc}} \subset (\mathbb{Z}^+)^2 = \text{the 1-level of } (x_1 + 1 - x_2)^2 + 1,$$
$$\Gamma_{1^{(n)}} \subset (\mathbb{Z}^+)^{n+1} = \text{the 1-level of } x_{n+1},$$
$$\Gamma_{\mathrm{pr}_i^n} \subset (\mathbb{Z}^+)^{n+1} = \text{the 1-level of } (x_i - x_{n+1})^2 + 1.$$


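Stability under juxtaposition. Let $f$ and $g$ be partial functions from $(\mathbb{Z}^+)^m$
to $(\mathbb{Z}^+)^p$ and $(\mathbb{Z}^+)^q$, respectively. Suppose that $\Gamma_f$ and $\Gamma_g$ are primitive
enumerable. Then $\Gamma_{(f,g)} \subset (\mathbb{Z}^+)^m \times (\mathbb{Z}^+)^p \times (\mathbb{Z}^+)^q$ coincides with the
intersection


$$(\Gamma_f \times (\mathbb{Z}^+)^q) \cap \mathrm{perm}(\Gamma_g \times (\mathbb{Z}^+)^p),$$


where $\mathrm{perm} : (\mathbb{Z}^+)^m \times (\mathbb{Z}^+)^q \times (\mathbb{Z}^+)^p \to (\mathbb{Z}^+)^m \times (\mathbb{Z}^+)^p \times (\mathbb{Z}^+)^q$ is the
operation of permuting the last two factors:


$$\langle x^{(m)}, y^{(q)}, z^{(p)} \rangle \mapsto \langle x^{(m)}, z^{(p)}, y^{(q)} \rangle.$$


It is clear from Lemma 4.8 that $\Gamma_{(f,g)}$ is primitive enumerable.

Stability under composition. Let $g$ be a partial function from $(\mathbb{Z}^+)^m$ to
$(\mathbb{Z}^+)^n$, let $f$ be a partial function from $(\mathbb{Z}^+)^n$ to $(\mathbb{Z}^+)^l$, and let $h = f \circ g$.
Then $\Gamma_h$ = the projection of the set $(\Gamma_g \times (\mathbb{Z}^+)^l) \cap ((\mathbb{Z}^+)^m \times \Gamma_f)$ onto
$(\mathbb{Z}^+)^m \times (\mathbb{Z}^+)^l$. As before, if $\Gamma_f$ and $\Gamma_g$ are primitive enumerable, then so is
$\Gamma_h$ by Lemma 4.8.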

The stability relative to recursion and the $\mu$-operator is much subtler.
We shall need the following elegant and useful lemma. 


4.9. Lemma. There exists a primitive recursive function $\mathrm{Gd}(k, t)$ (Gödel’s
function) with the following property: for any $N \in \mathbb{Z}^+$ and any finite
sequence $a_1, \ldots, a_N \in \mathbb{Z}^+$ of length $N$, there exists $t \in \mathbb{Z}^+$ such that
$\mathrm{Gd}(k, t) = a_k$ for all $1 \leq k \leq N$. (In other words, the function Gd allows
us to consider integers as encoding arbitrarily long sequences of integers:
$\mathrm{Gd}(k, t)$ is the $k$th member of the sequence encoded by $t$, and
the existence assertion assures that each sequence has an encoding.)


PROOF. We first set

$$\mathrm{gd}(u, k, t) = \mathrm{rem}(1 + kt, u)$$


and show that gd has the same property as Gd if we are allowed to choose
$\langle u, t \rangle \in (\mathbb{Z}^+)^2$. Once we show this, we can set $\mathrm{Gd}(k, y) =
\mathrm{gd}(t_1^{(2)}(y), k, t_2^{(2)}(y))$, where $t^{(2)} : \mathbb{Z}^+ \to (\mathbb{Z}^+)^2$ is the isomorphism in
Lemma 4.5. (It is not really essential to remove the extra parameter in
$\mathrm{gd}(u, k, t)$, but working with $\mathrm{Gd}(k, t)$ will make some of the formulas
shorter.)

Thus, suppose we are given $a_1, \ldots, a_N \in \mathbb{Z}^+$. We first choose $X \in \mathbb{Z}^+$
so as to satisfy $X \geq N$ and $1 + kX! > a_k$ for all $1 \leq k \leq N$. We then set
$t = X!$. It is easy to see that, if $k_1 \neq k_2$ and $k_1, k_2 \leq N$, then $1 + k_1 X!$ and
$1 + k_2 X!$ are relatively prime, since any common divisor would have to
divide $(k_1 - k_2) X!$, i.e., would have to consist of primes $\leq X$, but no such
prime divides $1 + k_1 X!$.

By the Chinese remainder theorem, there exists a solution $u \in \mathbb{Z}^+$ of the
system of congruences


$$u \equiv a_k \bmod (1 + kX!), \quad 1 \leq k \leq N.$$

It is then obvious that

$$\mathrm{gd}(u, k, t) = \mathrm{rem}(1 + kt, u) = a_k, \quad 1 \leq k \leq N. \qquad \Box$$
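
The construction in this proof is completely explicit, so it can be run on a small sequence. The sketch below is ours and encodes a nonempty sequence as a pair $\langle u, t \rangle$ rather than a single integer; it assumes Python 3.8 or later, where the built-in `pow(a, -1, m)` returns a modular inverse. The name `encode` is hypothetical.

```python
from math import factorial


def rem(x: int, y: int) -> int:
    """Remainder in [1, x] of y divided by x."""
    return (y - 1) % x + 1


def gd(u: int, k: int, t: int) -> int:
    return rem(1 + k * t, u)


def encode(seq: list[int]) -> tuple[int, int]:
    """Return (u, t) with gd(u, k, t) == seq[k-1] for k = 1, ..., len(seq)."""
    X = len(seq)                                        # X >= N
    while any(1 + k * factorial(X) <= a for k, a in enumerate(seq, 1)):
        X += 1                                          # ensure 1 + k*X! > a_k
    t = factorial(X)
    u, modulus = 0, 1
    for k, a in enumerate(seq, 1):
        m = 1 + k * t
        # Chinese remainder step: keep u modulo `modulus`, force u = a (mod m).
        u += modulus * ((a - u) * pow(modulus, -1, m) % m)
        modulus *= m
    return u, t
```

Composing with the pair numbering of Lemma 4.5 would turn $\langle u, t \rangle$ into the single code required by the lemma.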


We now continue with the proof of Theorem 4.3. 


4.10. Stability relative to the $\mu$-operator. Let $f$ be a partial function from
$(\mathbb{Z}^+)^{n+1}$ to $\mathbb{Z}^+$, and let


$$g(x_1, \ldots, x_n) = \min\{\, y \mid f(x_1, \ldots, x_n, y) = 1 \,\}.$$

Recall that the domain of definition of $g$ consists of those $\langle x_1, \ldots, x_n \rangle$ for
which such a $y$ exists and $\langle x_1, \ldots, x_n, k \rangle \in D(f)$ for all $k$ less than the
least such $y$. We want to prove that, if $\Gamma_f$ is primitive enumerable, then so
is $\Gamma_g$.

Suppose that $\Gamma_f$ is the projection onto the first $n + 2$ coordinates of the
1-level of a primitive recursive function $F$:


$$t = f(x_1, \ldots, x_{n+1}) \iff \exists y_1, \ldots, y_m, \ F(x_1, \ldots, x_{n+1}, t, y_1, \ldots, y_m) = 1$$


(where $t$ has been used to denote the argument of $F$ which becomes the
value of $f$). As in 4.4, it suffices to consider the case $m = 1$, since, if $m \geq 2$,
then we can use Lemma 4.5 to replace the vector $\langle y_1, \ldots, y_m \rangle$ by a single
$y$, and if $m = 0$, then we can introduce a “dummy argument” $y$ on which $F$
does not actually depend.

Thus, let $m = 1$. We introduce a function $G$ of the arguments
$x_1, \ldots, x_n, y, y_1, t, t_1$ by setting $s(1) = 2$, $s(x) = 1$ for $x \geq 2$, and

$$F_k = F(x_1, \ldots, x_n, k, \mathrm{Gd}(k, t), \mathrm{Gd}(k, t_1)), \quad k \geq 1,$$
$$G = F(x_1, \ldots, x_n, y, 1, y_1) \prod_{k=1}^{y-1} s(\mathrm{Gd}(k, t))\, F_k.$$

Here $\prod_{k=1}^{0} = 1$ by definition. It is easy to see that $G$ is primitive recursive,
since it is obtained by recursion on $y$ from two other functions which are
obviously primitive recursive. We shall show that $\Gamma_g$ is the projection of the
1-level of $G$ onto the coordinates $\langle x_1, \ldots, x_n, y \rangle$.

The inclusion $\mathrm{pr}(G = 1) \subset \Gamma_g$. Let $\langle x_1, \ldots, x_n, y, y_1, t, t_1 \rangle$ be a point in
the 1-level of $G$. We must verify that $\langle x_1, \ldots, x_n \rangle \in D(g)$ and that
$y = g(x_1, \ldots, x_n)$. In other words, we must show that


$$f(x_1, \ldots, x_n, y) = 1;$$
$$f(x_1, \ldots, x_n, k) \text{ is defined and } > 1 \text{ for all } k \leq y - 1.$$


Since $G = 1$ at the given point, it follows that all the factors in $G$ equal 1
there. In particular, $F(x_1, \ldots, x_n, y, 1, y_1) = 1$, which implies that
$f(x_1, \ldots, x_n, y) = 1$, because $\Gamma_f$ is the projection of the 1-level of $F$. If
$y = 1$, there is nothing more to be proved.

Suppose $y > 1$. Since the $k$th factor in the product $\prod_{k=1}^{y-1}$ equals 1, we
obtain:

$$s(\mathrm{Gd}(k, t)) = 1 \Rightarrow \mathrm{Gd}(k, t) \geq 2,$$
$$F_k = 1 \Rightarrow \mathrm{Gd}(k, t) = f(x_1, \ldots, x_n, k) \geq 2,$$

as required.

The inclusion $\Gamma_g \subset \mathrm{pr}(G = 1)$. Let $\langle x_1, \ldots, x_n, y \rangle \in \Gamma_g$. We must choose
values for the remaining coordinates $y_1$, $t$, and $t_1$ in such a way as to make
all the factors in $G$ equal to 1.

First of all, $\langle x_1, \ldots, x_n, y, 1 \rangle \in \Gamma_f$ by the definition of $g$. We find the
necessary value of $y_1$ by lifting this point from $\Gamma_f$ to the 1-level of $F$. If
$y = 1$, we may choose arbitrary values of $t$ and $t_1$.

Suppose $y > 1$. We then find $t$ from the system of equations


$$\mathrm{Gd}(k, t) = f(x_1, \ldots, x_n, k), \quad \text{for all } 1 \leq k \leq y - 1.$$


(Here the right side exists by the definition of $D(g)$.)
Finally, for each $k \leq y - 1$ we lift the point


$$\langle x_1, \ldots, x_n, k, \mathrm{Gd}(k, t) \rangle \in \Gamma_f$$


to a point on $F = 1$ having additional coordinate $y^{(k)}$, and then we find $t_1$
from the system of equations


$$\mathrm{Gd}(k, t_1) = y^{(k)}, \quad 1 \leq k \leq y - 1.$$


This makes all the factors in $\prod_{k=1}^{y-1}$ equal to 1. In fact, $s(\mathrm{Gd}(k, t)) = 1$, since
$\mathrm{Gd}(k, t) = f(x_1, \ldots, x_n, k) \geq 2$ for $k \leq y - 1$, and, finally, $F_k =
F(x_1, \ldots, x_n, k, \mathrm{Gd}(k, t), \mathrm{Gd}(k, t_1)) = 1$ by the definition of $t$ and $t_1$. □


4.11. Stability relative to recursion. We now carry out the last step in the 
proof of Theorem 4.3. 

Let $f$ and $g$ be partial functions of $n$ and $n + 2$ variables, respectively,
and let $h$ be the function of $n + 1$ variables which is obtained from $f$ and $g$
using recursion:


$$h(x_1, \ldots, x_n, 1) = f(x_1, \ldots, x_n),$$
$$h(x_1, \ldots, x_n, k + 1) = g(x_1, \ldots, x_n, k, h(x_1, \ldots, x_n, k)).$$

We must show that, if $\Gamma_f$ and $\Gamma_g$ are primitive enumerable, then so is $\Gamma_h$.

Let $F$ and $G$ be primitive recursive functions whose 1-levels project onto
$\Gamma_f$ and $\Gamma_g$, respectively:

$$\xi = f(x_1, \ldots, x_n) \iff \exists y, \ F(x_1, \ldots, x_n, \xi, y) = 1,$$
$$\zeta = g(x_1, \ldots, x_{n+2}) \iff \exists z, \ G(x_1, \ldots, x_{n+2}, \zeta, z) = 1,$$

where, as in 4.10, it suffices to consider the case when the projection
codimension is 1.

We shall explicitly construct a function $H$ whose 1-level projects onto
$\Gamma_h$. $H$ will be a function of the arguments $x_1, \ldots, x_{n+1}, \eta, y, t, t_1$ (where $\eta$
is the argument which becomes the value of $h$). We set:

$$\bar{s}(1) = 1, \quad \bar{s}(x) = 2 \text{ for } x \geq 2;$$
$$G_k = G(x_1, \ldots, x_n, k - 1, \mathrm{Gd}(k - 1, t), \mathrm{Gd}(k, t), \mathrm{Gd}(k, t_1));$$
$$H = F(x_1, \ldots, x_n, \mathrm{Gd}(1, t), y) \cdot \bar{s}\big((\eta - \mathrm{Gd}(x_{n+1}, t))^2 + 1\big) \cdot \prod_{k=2}^{x_{n+1}} G_k.$$


(We take $\prod_{k=2}^{1} = 1$ if $x_{n+1} = 1$.) As in 4.10, we easily verify that $H$ is
primitive recursive.

The inclusion $\mathrm{pr}(H = 1) \subset \Gamma_h$. Let $\langle x_1, \ldots, x_{n+1}, \eta, y, t, t_1 \rangle$ be a point
on $H = 1$. We must show that $h(x_1, \ldots, x_{n+1}) = \eta$. Since the second factor
in $H$ equals 1, we first obtain $\eta = \mathrm{Gd}(x_{n+1}, t)$. If we also have $x_{n+1} = 1$,
then setting the first factor in $H$ equal to 1 gives:

$$\eta = \mathrm{Gd}(1, t) = f(x_1, \ldots, x_n) = h(x_1, \ldots, x_n, 1).$$

Now suppose $x_{n+1} > 1$. In this case, using the equation $G_k = 1$ we find
that for all $2 \leq k \leq x_{n+1}$

$$\mathrm{Gd}(k, t) = g(x_1, \ldots, x_n, k - 1, \mathrm{Gd}(k - 1, t)),$$

and using the equation $F = 1$ and the definition of $h$ we find that

$$\mathrm{Gd}(1, t) = f(x_1, \ldots, x_n) = h(x_1, \ldots, x_n, 1).$$

If we increase $k$ from $k = 1$ to $k = x_{n+1}$ and use the recursive definition of
$h$, we see by induction on $k$ that $\mathrm{Gd}(k, t) = h(x_1, \ldots, x_n, k)$ and, in
particular,

$$\eta = \mathrm{Gd}(x_{n+1}, t) = h(x_1, \ldots, x_n, x_{n+1}).$$
The inclusion $\Gamma_h \subset \mathrm{pr}(H = 1)$. We are given a point

$$\langle x_1, \ldots, x_{n+1}, h(x_1, \ldots, x_{n+1}) \rangle \in \Gamma_h.$$

We let $\eta = h(x_1, \ldots, x_{n+1})$. We must also choose values of $y$, $t$, and $t_1$ so
as to make $H$ equal to 1.

If $x_{n+1} = 1$, we choose $t$ so that $\mathrm{Gd}(1, t) = h(x_1, \ldots, x_n, 1) =
f(x_1, \ldots, x_n)$. We then lift the point $\langle x_1, \ldots, x_n, \mathrm{Gd}(1, t) \rangle \in \Gamma_f$ to a point
on $F = 1$. This gives us the value of $y$; $t_1$ may be chosen arbitrarily.

Now let $x_{n+1} > 1$. We first find $t$ from the system of equations


$$\mathrm{Gd}(1, t) = f(x_1, \ldots, x_n) = h(x_1, \ldots, x_n, 1),$$
$$\mathrm{Gd}(k, t) = h(x_1, \ldots, x_n, k) = g(x_1, \ldots, x_n, k - 1, \mathrm{Gd}(k - 1, t)), \quad 2 \leq k \leq x_{n+1}.$$

We then find $y$ by lifting the point $\langle x_1, \ldots, x_n, \mathrm{Gd}(1, t) \rangle \in \Gamma_f$ to the
1-level of $F$. This makes the first two factors in $H$ equal to 1.
We next lift the points


$$\langle x_1, \ldots, x_n, k - 1, \mathrm{Gd}(k - 1, t), \mathrm{Gd}(k, t) \rangle \in \Gamma_g, \quad 2 \leq k \leq x_{n+1},$$


to the 1-level of $G$ by adding coordinates $z^{(k)}$, and then solve the following
system of equations for $t_1$:


$$\mathrm{Gd}(k, t_1) = z^{(k)}, \quad 2 \leq k \leq x_{n+1}.$$


This makes the $G_k$ factors in $H$ equal to 1.
The proof of Theorem 4.3 is complete. □


4.12. Explanation of the term “recursively enumerable set.” Theorem 4.3 
shows that, if E is recursively enumerable, then there exists a program 
which “generates” $E$ (see 4.1). In fact, suppose $E$ is the projection onto the
first $n$ coordinates of the 1-level of the primitive recursive function
$f(x_1, \ldots, x_n, y)$. The program that generates $E$ must run through the
vectors $\langle x_1, \ldots, x_n, y \rangle$, say in Cantor order, compute $f$ at each vector, and
give $\langle x_1, \ldots, x_n \rangle$ as output if and only if $f$ equals 1 (compare with
Corollary 4.18 below). Unlike programs of the type described in §1, which
can become stuck forever on an element not in $E$, a generating program
sooner or later gives us any given element of $E$, and nothing other than
such elements. However, if E is empty, we might never find this out. 
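
The generating program described in 4.12 can be sketched as follows. This is only an illustration under stated assumptions: the ordering by coordinate sum stands in for the Cantor order, and `is_square_witness` is a hypothetical stand-in for a primitive recursive function whose 1-level projects onto the set of perfect squares.

```python
from itertools import count, islice

def positive_vectors(dim):
    """Yield every vector of `dim` positive integers exactly once,
    ordered by coordinate sum (a stand-in for the Cantor order)."""
    def compositions(total, parts):
        if parts == 1:
            yield (total,)
            return
        for first in range(1, total - parts + 2):
            for rest in compositions(total - first, parts - 1):
                yield (first,) + rest
    for total in count(dim):
        yield from compositions(total, dim)

def generate(f, n):
    """Enumerate the projection onto the first n coordinates of the
    1-level of f, where f takes n + 1 positive integer arguments.
    Elements may be produced more than once."""
    for vec in positive_vectors(n + 1):
        if f(*vec) == 1:
            yield vec[:n]

def is_square_witness(x, y):
    # Hypothetical example: equals 1 exactly when y witnesses that x is a square.
    return 1 if x == y * y else 2

first_squares = list(islice(generate(is_square_witness, 1), 4))
```

The generator never terminates on its own; it only guarantees that each element of the set appears after finitely many steps, which is why an empty set can never be recognized as empty this way.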

We conclude this section by discussing the properties of the so-called
decidable sets. Intuitively, $E \subset (Z^+)^n$ is decidable if there exists a program
which for every element of $(Z^+)^n$ tells whether or not it belongs to $E$.


4.13. Definition. A set $E \subset (Z^+)^n$ is called decidable if both it and its
complement are recursively enumerable.


In §5 and in the next chapter we show that there exist sets which are
recursively enumerable but not decidable. This result is closely connected
with Gödel's incompleteness theorem, which is the subject of Chapter VII.
4.14. Theorem. The following three classes of sets coincide: 


(a) sets whose characteristic function is recursive;
(b) level sets of general recursive (i.e., everywhere defined partial
recursive) functions;
(c) decidable sets.


Proof. The inclusions (a) $\subset$ (b) and (b) $\subset$ (c) are obvious from what has
already been proved. It thus remains to show that (c) $\subset$ (a).

Let $E \subset (Z^+)^n$ be a decidable set, and let $E'$ be its complement. By
definition, $E = D(f)$ and $E' = D(f')$ for certain partial recursive functions
$f$ and $f'$. We may even assume that $f = 1$ and $f' = 2$ (where they are
defined). We consider $\Gamma_f \cup \Gamma_{f'} \subset (Z^+)^n \times Z^+$. This union is obviously the
graph $\Gamma_g$ of the characteristic function $g$ of the set $E$. It is clear from the
proof of Lemma 4.8 that $\Gamma_g$ is recursively enumerable whenever $\Gamma_f$ and $\Gamma_{f'}$
are. Hence, the partial recursiveness of $g$ is implied by the following result,
which is also of independent interest.


4.15. Proposition. In order for a partial function $g$ from $(Z^+)^n$ to $Z^+$ to be
partial recursive, it is necessary and sufficient that its graph $\Gamma_g$ be
recursively enumerable.


Proof. Necessity has already been proved.
We verify sufficiency. Since $\Gamma_g$ is recursively enumerable, there exists a
primitive recursive function $G(x_1, \ldots, x_n, y, z)$ (see 4.10) such that $\Gamma_g$ =
the projection of the 1-level of $G$ onto $\langle x_1, \ldots, x_n, y \rangle$. We set

$$H(x_1, \ldots, x_n, u) = G(x_1, \ldots, x_n, t_1^{(2)}(u), t_2^{(2)}(u)),$$

where $u \mapsto \langle t_1^{(2)}(u), t_2^{(2)}(u) \rangle$ is the primitive recursive isomorphism $Z^+ \to
(Z^+)^2$ described in 4.5 and 4.7. $H$ is obviously primitive recursive. Finally,
we set

$$h(x_1, \ldots, x_n) = \min\{u \mid H(x_1, \ldots, x_n, u) = 1\}.$$

This is a partial recursive function whose domain of definition coincides
with $D(g)$ and which easily allows us to compute $g$:

$$g(x_1, \ldots, x_n) = t_1^{(2)}(h(x_1, \ldots, x_n)).$$

Thus, $g$ is partial recursive, and the proof of Proposition 4.15 and Theorem
4.14 is complete. □
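
The sufficiency argument of 4.15 amounts to a single unbounded search over encoded pairs. The sketch below illustrates this under assumptions: `unpair` is a hypothetical diagonal enumeration standing in for the isomorphism of 4.5 and 4.7 (whose exact form is not reproduced here), and `square_root_graph` is an invented example of a function whose 1-level projects onto a graph.

```python
def unpair(u):
    """Hypothetical bijection Z+ -> Z+ x Z+ (enumeration along diagonals);
    it plays the role of the primitive recursive isomorphism in the text."""
    d = 1
    while u > d:
        u -= d
        d += 1
    return u, d - u + 1

def function_from_graph(G):
    """Given G(x_1, ..., x_n, y, z) whose 1-level projects onto the graph
    of a partial function g, return g. The loop is the single application
    of the minimization operator; it diverges outside D(g)."""
    def g(*xs):
        u = 1
        while True:
            y, z = unpair(u)
            if G(*xs, y, z) == 1:
                return y
            u += 1
    return g

def square_root_graph(x, y, z):
    # Hypothetical example: the graph of the partial function "exact square root".
    return 1 if (x == y * y and z == 1) else 2

exact_sqrt = function_from_graph(square_root_graph)
```

Calling `exact_sqrt(49)` halts with the value 7, whereas `exact_sqrt(50)` would search forever, mirroring the fact that the domain of the minimization coincides with the domain of $g$.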


4.16. Corollary. Every partial recursive function $g$ has a description in which
the $\mu$-operator is only applied once.


4.17. Corollary. Every partial recursive function $g$ which is everywhere
defined has a description $g_1, \ldots, g_N = g$ in which all the functions $g_i$ are
everywhere defined.


In fact, the description whose last part (starting with $G$) was constructed
in 4.15 has this property.


4.18. Corollary. The class of nonempty recursively enumerable sets coincides
with the class of sets of values of primitive recursive functions.


In fact, the set of values of a function $f$ is a projection of the graph of $f$.
Conversely, let $E \subset (Z^+)^n$ be a nonempty enumerable set that is the
projection onto the $\langle x_1, \ldots, x_n \rangle$-space of the 1-level of a primitive
recursive function $f(x_1, \ldots, x_n, y)$. Let $\langle e_1, \ldots, e_n \rangle$ be an arbitrary member of
$E$. Then $E$ coincides with the set of values of the primitive recursive
function

$$a(z) = \begin{cases} \langle t_1^{(n+1)}(z), \ldots, t_n^{(n+1)}(z) \rangle, & \text{if } f(t_1^{(n+1)}(z), \ldots, t_{n+1}^{(n+1)}(z)) = 1; \\ \langle e_1, \ldots, e_n \rangle, & \text{if not.} \end{cases}$$
□
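
The construction in 4.18 differs from an unbounded search in that the enumerating function is total: whenever the decoded vector fails the test, a fixed element of the set is returned instead. A minimal sketch for $n = 1$, assuming a hypothetical pairing `unpair` and an invented test function `f`:

```python
def unpair(z):
    """Hypothetical bijection Z+ -> Z+ x Z+ (enumeration along diagonals)."""
    d = 1
    while z > d:
        z -= d
        d += 1
    return z, d - z + 1

def enumerator(f, default):
    """Return the total function a(z) of Corollary 4.18 for n = 1:
    decode z into (x, y); output x if f(x, y) = 1, else the fixed
    element `default` of the set."""
    def a(z):
        x, y = unpair(z)
        return x if f(x, y) == 1 else default
    return a

def f(x, y):
    # Hypothetical example: the 1-level projects onto the perfect squares.
    return 1 if x == y * y else 2

a = enumerator(f, default=1)          # 1 is a known element of the set
values_so_far = {a(z) for z in range(1, 501)}
```

Every value of `a` lies in the set, and every element of the set is eventually a value, so the set of values is exactly the enumerated set; the price is that the default element is repeated very often.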


4.19. Corollary. 


(a) Finite sets and their complements in $(Z^+)^n$ are decidable.
(b) Every partial function from $(Z^+)^m$ to $(Z^+)^n$ with a finite domain of
definition is recursive and computable.


In fact, the one-point set $\{a\} \subset Z^+$ is a level for a suitable sum of two
step functions, and its complement is a level for another such sum.
Decidability is preserved under finite union and intersection, so we have
(a) for $n = 1$. Then the isomorphism $\tau$ allows us to infer this result for all
$n$.

This also implies (b), since the graphs of the mappings in (b) are finite, 
and therefore enumerable. 


5 Elements of recursive geometry 


5.1. Let $E \subset (Z^+)^n$ be an enumerable set. We consider the structure on $E$
given by the following data:


(a) $\mathcal{E} = \{E' \mid E' \subset E,\ E' \text{ is enumerable}\}$.
(b) For every $E' \in \mathcal{E}$, $R(E') = \{f \mid D(f) = E',\ f : E' \to Z^+ \text{ is recursive}\}$.


We let $\mathcal{R}$ = the set of pairs $\langle E', R(E') \rangle$, $E' \in \mathcal{E}$.

We shall show that the structure $\{\mathcal{E}, \mathcal{R}\}$ has much in common with the
structure “a topological space together with a sheaf.” This allows us to find
natural interpretations for certain well-known results about enumerable
sets, and to ask new questions suggested by analogies with other
geometrical theories.

We begin with some simple observations.


5.2. $\mathcal{E}$ is a lattice, i.e., it is closed with respect to finite unions and
intersections.

Since $\mathcal{E}$ is not closed with respect to arbitrary infinite unions, we
cannot consider $\mathcal{E}$ as the system of open subsets of $E$ in some topology.
Nevertheless, in subsection 5.9 below we show that $\mathcal{E}$ is stable with respect
to an important class of infinite unions. We shall say that $\mathcal{E}$ determines a
quasitopology on $E$ (which has properties similar to those of Grothendieck
topologies, but does not satisfy all the axioms of the latter).


5.3. Let $E', E'' \in \mathcal{E}$ and $E' \subset E''$. Then the restriction of functions to $E'$
gives a mapping $R(E'') \to R(E')$: $f \mapsto f|_{E'}$.

In fact, let $c_{E'} \in R(E')$ and $c_{E'} = 1$ on $E'$. Then $f|_{E'} = f c_{E'}$ is recursive
whenever $f$ and $c_{E'}$ are.


5.4. Let $E' = \bigcup_{k=1}^n E_k$, where $E', E_k \in \mathcal{E}$. Suppose that the $f_k \in R(E_k)$ and
are compatible on intersections:

$$\forall i, j \le n, \quad f_i|_{E_i \cap E_j} = f_j|_{E_i \cap E_j}.$$

Then there exists an (obviously unique) function $f \in R(E')$ such that $\forall k \le
n$, $f|_{E_k} = f_k$.

We need only verify that $f \in R(E')$, since there obviously exists a
function $f : E' \to Z^+$ which is “glued together” from the $f_k$. But the graph
of $f$ is the union of the finitely many enumerable graphs $\Gamma_{f_k} \subset E' \times Z^+$, and
so is itself enumerable. We then use Proposition 4.15.

The results 5.3 and 5.4 allow us to consider $\mathcal{R}$ as a sheaf on the
quasitopology $\mathcal{E}$.


5.5. Let $E_1$ and $E_2$ be enumerable sets, and let $f : E_1 \to E_2$ be a recursive
function. Then $f$ induces a morphism of the corresponding quasitopologies with
sheaves in the following sense:


(a) If $E' \subset E_2$ is enumerable, then $f^{-1}(E') \subset E_1$ is enumerable.
(b) For every $E' \subset E_2$, composition with $f$ determines a mapping

$$f^* : R(E') \to R(f^{-1}(E')).$$

The first part follows because $c_{f^{-1}(E')} = c_{E'} \circ f$ is recursive whenever $c_{E'}$
and $f$ are; the second part is obvious.

One might get the impression that the pair $\langle \mathcal{E}, \mathcal{R} \rangle$ completely
characterizes $E$ independently of the imbedding $E \subset (Z^+)^n$. However, this is
not the case.


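5.6. Proposition. Let $E_1$ and $E_2$ be enumerable infinite sets. Then there exists
a bijection $f : E_1 \to E_2$ such that $f$ and $f^{-1}$ are (partial) recursive. $f$
induces an isomorphism $\langle \mathcal{E}_1, \mathcal{R}_1 \rangle \to \langle \mathcal{E}_2, \mathcal{R}_2 \rangle$.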


Proof. We establish the following more precise facts:
(a) If $E \subset Z^+$ is infinite and decidable, then there exists a general
recursive bijection $f : Z^+ \to E$ for which $f^{-1}$ is (partial) recursive and is an
increasing function. The converse is also true.

(b) If $E \subset Z^+$ is infinite and enumerable, then there exists a general
recursive bijection $f : Z^+ \to E$ with $f^{-1}$ (partial) recursive.

First suppose $E$ is decidable, and let $g(x) = 2$ for $x \in E$, $g(x) = 1$ for
$x \notin E$, and $h = c_E$. We set

$$f(z) = \min\Big\{ y \;\Big|\; \sum_{x=1}^{y} g(x) - y - z + 1 = 1 \Big\} = \text{the } z\text{th element of } E.$$

It is easy to see that

$$f^{-1}(x) = \Big( \sum_{y=1}^{x} g(y) - x \Big) h(x)$$

is equal to the index of $x$ as an element of $E$, if $x \in E$, and is not defined
otherwise.

Now suppose $E$ is enumerable. By Corollary 4.18, there exists a primitive
recursive function $g : Z^+ \to E$ whose image coincides with $E$. We shall
adjust $g$ so that it becomes bijective. We set

$$F = \{k \in Z^+ \mid \forall i < k,\ g(i) \ne g(k)\}.$$

This set is decidable, since it is the 1-level of the following primitive
recursive function $h$:

$$h(1) = 1; \qquad h(k) = \prod_{i=1}^{k-1} s\big((g(i) - g(k))^2 + 1\big), \quad \text{for } k \ge 2;$$

$$s(x) = \begin{cases} 1, & \text{for } x \ge 2, \\ 2, & \text{for } x = 1. \end{cases}$$

By the previous result, there exists a recursive bijection $g' : Z^+ \to F$. Let
$f = g \circ g'$. Since $g|_F : F \to E$ is a bijection, it follows that $f : Z^+ \to E$ is also a
bijection. The inverse function is partial recursive because

$$f^{-1}(x) = \min\{ y \mid (f(y) - x)^2 + 1 = 1 \}.$$

The proposition is proved. □
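
Both halves of this proof are constructive and can be sketched directly. In the illustration below the enumeration `g` is a hypothetical example with repetitions (its image is the set of perfect squares), and plain Python predicates stand in for the recursive functions of the text.

```python
from itertools import count

def nth_element(in_E):
    """Part (a): for an infinite decidable set given by a membership
    test, return f with f(z) = the z-th element in increasing order."""
    def f(z):
        seen = 0
        for y in count(1):
            if in_E(y):
                seen += 1
                if seen == z:
                    return y
    return f

def first_occurrences(g):
    """Membership test for F = {k : g(i) != g(k) for all i < k}."""
    return lambda k: all(g(i) != g(k) for i in range(1, k))

def bijective_enumeration(g):
    """Part (b): f = g o g', where g' enumerates F in increasing order."""
    g_prime = nth_element(first_occurrences(g))
    return lambda z: g(g_prime(z))

def g(k):
    # Hypothetical enumeration with repetitions: 1, 4, 4, 9, 9, 16, 16, ...
    return (k // 2 + 1) ** 2

f = bijective_enumeration(g)
first_values = [f(z) for z in range(1, 6)]
```

The resulting `f` lists each element of the image of `g` exactly once, in the order of first appearance; the search in `nth_element` terminates only because the set is assumed infinite.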


Because of this result we usually consider the imbedding $E \subset (Z^+)^n$ to
be an essential element of the structure on $E$. In particular, we call $E_1$ and
$E_2$ isomorphic if there exists a bijection between them which is induced by
a recursive bijection of the ambient spaces.

The complete classification of enumerable sets up to isomorphism is not 
known, but many subtle results have been obtained in the theory of 
“reducibilities.” We shall only go so far as to show, using a theorem which 
will be proved in the next chapter, that not all enumerable sets are 
decidable.


5.7. Families. Suppose that $m > 0$ and $B$ is a set. By a family of m-sets (or
an m-family) over the base $B$ we mean any mapping $B \to \mathcal{P}((Z^+)^m)$. If
$E_k \subset (Z^+)^m$ is the image of $k \in B$ under this mapping, we also denote this
family by $\{E_k\}$. We call the set $E = \{\langle x, k \rangle \mid x \in E_k\} \subset (Z^+)^m \times B$ the total
space of the family.

Similarly, we call a mapping $B \to$ {partial functions from $(Z^+)^m$ to $Z^+$}
a family of m-functions over the base $B$. We call the function $f : \langle x, k \rangle
\mapsto f_k(x)$ for $x \in D(f_k)$ the total function of the family.

A family of m-sets (resp. m-functions) is said to be enumerable, if
$B \subset (Z^+)^n$ for some $n$, and if the total space is enumerable in $(Z^+)^m \times
(Z^+)^n$ (resp. the total function is partial recursive on $(Z^+)^m \times (Z^+)^n$).

If $\{E_k\}$ is enumerable, then the set $\{k \in B \mid E_k \text{ is nonempty}\}$ is
enumerable, since it is a projection of the total space $E$. Each of the $E_k$ is
enumerable, since it is the intersection $E \cap ((Z^+)^m \times \{k\})$.

Similarly, if $\{f_k\}$ is enumerable, then the set $\{k \in B \mid f_k$ is not the
nowhere defined function$\}$ is enumerable, since it is a projection of the
domain of definition of the total function $f$. Each of the $f_k$ is partial
recursive, since it is the restriction of $f$ to the enumerable set $D(f) \cap ((Z^+)^m
\times \{k\})$.

If $\{f_k\}$ is an enumerable family of m-functions, then $\{D(f_k)\}$ is an
enumerable family of m-sets (with total space $D(f)$), and $\{\Gamma_{f_k}\}$ is an
enumerable family of (m + 1)-sets (with total space $\Gamma_f$, or more precisely,
$\Gamma_f$ after a permutation of its factors).

An enumerable family $\{E_k\}$ (respectively $\{f_k\}$) is said to be versal if
every enumerable m-set (resp. any partial recursive m-function) is among
the elements of the family. (The word “versal” is borrowed from algebraic
geometry, after removing the prefix “uni” which would indicate that each
term in the family could only occur once.) In §8 of the next chapter we
show that versal families exist for each $m$. This is one of the central results
of the theory, since total spaces and total functions of versal families are
the starting point for practically all investigations of undecidability. Here
we limit ourselves to the simplest and most fundamental application:


5.8. Theorem. Let $\{E_k\}$ be a versal family of 1-sets over the base $B \subset Z^+$.
Then the set

$$F = \{k \mid k \in E_k\}$$

is enumerable, but is not decidable.


Proof. Let $E \subset Z^+ \times Z^+$ be the total space of the family. Then $F$ = the
projection of $E \cap$ (diagonal in $Z^+ \times Z^+$) onto the first factor, and therefore
is enumerable.

On the other hand, for every $k \in B$ we have $\bar{F} = Z^+ \setminus F \ne E_k$, since $k$
belongs to either $\bar{F}$ or $E_k$, but not to both. Since $\{E_k\}$ is a versal family, $\bar{F}$
cannot be enumerable. The theorem is proved. □

We now show how to use enumerable families to strengthen the results
in 5.2 and 5.4. We return to the notation at the beginning of the section.


5.9. $\mathcal{E}$ is closed with respect to taking the union of the elements of any
enumerable family of subsets of $E$.

In fact, suppose that $\{E_k\}$ is such a family and $E'$ is its total space,
where $E' \subset (Z^+)^m \times (Z^+)^n$. Then

$$\bigcup_{k \in B} E_k = \text{the projection of } E' \text{ on } (Z^+)^m.$$


5.10. Suppose that $\{f_k\}$ is an enumerable family of partial functions on
$E$, $E_k = D(f_k)$, $E' = \bigcup_{k \in B} E_k$, and

$$\forall i, j \in B, \quad f_i|_{E_i \cap E_j} = f_j|_{E_i \cap E_j}.$$

Then there exists a unique function $f \in R(E')$ which is glued together from
the $f_k$.

In fact, the graph $\Gamma_f$ is enumerable, since it is the union of the
enumerable family of enumerable sets $\Gamma_{f_k}$.


5.11. After these remarks it is natural to consider the following system of
ideas by way of analogy with the theory of spaces with sheaves.

(a) Let $E = \bigcup_{k \in B} E_k$ be a covering of an enumerable set by an
enumerable family. Then for any $n \ge 1$ the family $\{E_{k_1} \cap \cdots \cap E_{k_n} \mid
\langle k_1, \ldots, k_n \rangle \in B^n\}$ is also an enumerable covering of $E$. In fact, let
$E'$ = the total space of $\{E_k\} \subset E \times B$, and let


Bd Rigen Mai isc yp pS Et = We Siy t 
mw E’xX-+- X E'(n times). 


Then the total space of the family $\{E_{k_1} \cap \cdots \cap E_{k_n}\}$ is isomorphic to
(diagonal in E”) x B"Q E”. 

(b) Using the same notation, we define the “recursive product” $R\Pi_n \subset
\prod_{k_1, \ldots, k_n} R(E_{k_1} \cap \cdots \cap E_{k_n})$ as follows: $R\Pi_0 = R(E)$, $R\Pi_n$ = the set of
enumerable families $\{f_{k_1 \ldots k_n}\}$ over $B^n$ such that

$$f_{k_1 \ldots k_n} \in R(E_{k_1} \cap \cdots \cap E_{k_n}), \quad \text{for } n \ge 1.$$

(c) For every $n \ge 0$ we have the following boundary mappings:

$$\partial_i^n : R\Pi_n \to R\Pi_{n+1}, \quad i = 1, \ldots, n + 1;$$
$$(\partial_i^n(f))_{k_1 \ldots k_{n+1}} = f_{k_1 \ldots \hat{k}_i \ldots k_{n+1}} \big|_{E_{k_1} \cap \cdots \cap E_{k_{n+1}}}.$$

(Note that we really do have $\partial_i^n(R\Pi_n) \subset R\Pi_{n+1}$.)

It is possible to associate various types of “recursive Čech cohomology
groups” of the covering $\bigcup_{k \in B} E_k$ to the object

$$R\Pi_0 = R(E) \xrightarrow{\ \partial_1^0\ } R\Pi_1 \underset{\partial_2^1}{\overset{\partial_1^1}{\rightrightarrows}} R\Pi_2 \cdots$$

It would be interesting to study such cohomology groups. The result 5.10
shows that this complex is “exact at the first term.”

The reader should not find it hard to imagine what other geometrical 
concepts would look like in this context. In particular, it would be 
worthwhile to study the quotients of enumerable sets by enumerable 
equivalence relations. Higman’s theorem (see Chapter VIII) gives a char- 
acterization of groups in the category of such objects. 

We conclude by giving several results on the structure of $\mathcal{E}$. Because of
Proposition 5.6, we need only consider subsets of $Z^+$; that is, we take
$\mathcal{E} = \{E' \mid E' \subset Z^+,\ E' \text{ enumerable}\}$.


5.12. Proposition. There exist enumerable subsets $F \subset Z^+$ having an infinite
complement, such that for any infinite $E \in \mathcal{E}$ we have $F \cap E \ne \emptyset$, so that
$F \cap E$ is infinite.


Such $F$ are called simple. From a topological point of view they
resemble dense open sets.


Proof. Let $\{E_k\}$ be a versal family of 1-sets over $Z^+$ with total space
$E \subset Z^+ \times Z^+$. We set $E' = E \cap \{\langle x, k \rangle \mid x > 2k\}$. Since $E'$ is enumerable,
there exists a primitive recursive function with image $E'$:

$$g = (g_1, g_2) : Z^+ \to E'.$$

Let $h(k) = \min\{z \mid g_2(z) = k\}$, let $f(k) = g_1(h(k))$, and let $F$ denote the set
of values of $f$. $F$ has an infinite complement, since $f(k) > 2k$. The
intersection of $F$ with an infinite $E_k$ is nonempty, since any value of $g_1(z)$ when
$g_2(z) = k$ lies in $E_k \cap E' \ne \emptyset$. The proposition is proved. □


5.13. Proposition.
(a) The quotient lattice $\mathcal{E}$/(finite sets) has nontrivial maximal
elements.
(b) Every nonsimple enumerable set with an infinite complement is
contained in such a maximal element.
(c) There exist simple enumerable sets with an infinite complement
which are not contained in any nontrivial maximal set.


We refer the reader to Rogers' book for the proof of these and many
other results.


CHAPTER VI


Diophantine sets and 
algorithmic undecidability 


1 The basic result 


1.1. In §4 of Chapter V we showed that enumerable sets are the same thing
as projections of level sets of primitive recursive functions. The projections
of the level sets of a special kind of primitive recursive function, namely
polynomials with coefficients in $Z^+$, are called Diophantine sets. We note that
this class does not become any larger if we allow the coefficients in the
polynomial to lie in $Z$. The basic purpose of this chapter is to prove the
following deep result:


1.2. Theorem (M. Davis, H. Putnam, J. Robinson, Ju. Matijasevič, G.
Čudnovskiĭ). All enumerable sets are Diophantine.


The plan of proof is described in §2. §§3-7 contain the intricate yet 
completely elementary constructions which make up the proof itself; these 
sections are not essential for understanding the subsequent material, and 
may be omitted if the reader so desires. 

In §8 we use Theorem 1.2 to prove the existence of versal families of
enumerable sets and functions. Recall that in §5 of Chapter V this result
was shown to imply that enumerable sets exist which are undecidable, a
fact we shall use in subsection 1.3 below.

In §9, which stands somewhat apart from the rest of the chapter, we
define the Kolmogorov complexity of recursive functions, establish the
basic properties of this concept, and prove that the problem of computing
the complexity is algorithmically undecidable.

In Chapter VII the following corollary of Theorem 1.2 will be used in
an essential way: enumerable sets are definable in $L_1\mathrm{Ar}$. In fact, by their
very definition, Diophantine sets are defined by formulas of the form
$\exists x_1 \cdots \exists x_n (p)$, where $p$ is an atomic formula.

In the remainder of this section we describe the principal applications of
Theorem 1.2: settling Hilbert's tenth problem, constructing polynomials
which take only and all prime number values in $Z^+$, and so on.


1.3. Hilbert’s tenth problem. Hilbert stated it as follows: 


“Suppose we are given a Diophantine equation with an arbitrary number 
of unknowns and with rational integer coefficients. Give a way in which it 
is possible to determine after a finite number of operations whether or not 
this equation is solvable in rational integers.” 


We show that the combination of Theorem 1.2, Theorem 5.8 of Chapter V 
(which follows from Theorem 1.2), and Church’s thesis implies that this 
problem is undecidable. 

First of all, any natural number is the sum of four integer squares
(Lagrange). Hence $f(x_1, \ldots, x_n) = 0$ is solvable in $(Z^+)^n$ if and only if the
equation $f(1 + \sum_{i=1}^4 y_{i1}^2, \ldots, 1 + \sum_{i=1}^4 y_{in}^2) = 0$ is solvable in $Z^{4n}$.
Consequently, it is sufficient to show that the mass problem “determining
whether or not there are solutions in $(Z^+)^n$” (see subsection 2.6 of Chapter
V) is algorithmically undecidable.
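
The four-square substitution can be checked on small cases. The sketch below is illustrative only: `four_squares` is a brute-force search (not an efficient algorithm), and the polynomial `f` is a hypothetical example chosen because it has the positive solution (3, 2).

```python
from itertools import product
from math import isqrt

def four_squares(n):
    """Return integers (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = n for
    n >= 0, found by brute force. Existence is Lagrange's theorem."""
    r = isqrt(n)
    for a, b, c in product(range(r + 1), repeat=3):
        rest = n - a * a - b * b - c * c
        if rest >= 0 and isqrt(rest) ** 2 == rest:
            return a, b, c, isqrt(rest)
    raise AssertionError("unreachable by Lagrange's theorem")

def f(x1, x2):
    # Hypothetical example equation x1^2 - 2*x2^2 - 1 = 0.
    return x1 * x1 - 2 * x2 * x2 - 1

def f_substituted(ys1, ys2):
    """The equation in 4n = 8 integer unknowns obtained by replacing
    each x_j with 1 + (sum of four squares)."""
    return f(1 + sum(y * y for y in ys1), 1 + sum(y * y for y in ys2))

# A positive solution (3, 2) of f = 0 yields an integer solution of the
# substituted equation, and conversely every x = 1 + (sum of squares) is positive.
integer_solution = (four_squares(3 - 1), four_squares(2 - 1))
```

The point of the substitution is that the new unknowns range over all of $Z$ while the values fed into $f$ range over exactly the positive integers.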

Let $E \subset Z^+$ be an enumerable set which is not decidable. We represent
$E$ as the projection onto the $t$-coordinate of the 0-level of the polynomial
$f_t = f(t, x_1, \ldots, x_n)$, where $f \in Z[t, x_1, \ldots, x_n]$. The equation $f_{t_0} = 0$, $t_0 \in
Z^+$, has a solution if and only if $t_0 \in E$. By the discussion in §2 of Chapter
V, the corresponding mass problem for the family $\{f_t\}$ is algorithmically
decidable if and only if the characteristic function of $E$ is computable. But,
by our choice of $E$, this characteristic function is only semi-computable.

Thus, solvability in integers cannot even be determined algorithmically
for a suitable one-parameter family of equations. The number of unknowns
in the equation, and, in general, the codimension of the projection
in Theorem 1.2, can be reduced to 13 (Matijasevič, Robinson). The precise
minimum is not known, although it is an interesting problem.

Finally, it should be noted that the construction of a Diophantine
representation for any enumerable set $E$ is completely effective in the sense
that, given a recursive description of $f$ with $D(f) = E$ or of $g$ with
$g(Z^+) = E$, we can write out the corresponding polynomial explicitly. The
same holds for the construction of versal families, of an enumerable
undecidable set, and so on. These are all constructive assertions, and not
simple existence theorems.


1.4. Polynomials which represent the prime numbers. The search for “explicit 
formulas” for prime numbers was a traditional occupation of dedicated 
number theory enthusiasts for many centuries. Euler found the polynomial
$x^2 + x + 41$, which takes a long series of only prime values. But it has long
been known that the set of values at integer points of a polynomial $f$ in
$Z[x_1, \ldots, x_n]$ cannot consist entirely of prime numbers: for example, if $p$
and $q$ are two sufficiently large primes, then the congruence $f \equiv 0 \bmod pq$
can be solved (in infinitely many ways). On the other hand, the problem
becomes solvable in the class of primitive recursive functions: the function
$\{i \mapsto \text{the } i\text{th prime}\}$ is itself primitive recursive (see §1 of Chapter VII),
but for trivial reasons.
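
The "long series" of prime values of Euler's polynomial, and the point where it breaks off, can be verified directly. The sketch below uses a simple trial-division primality test and starts the series at $x = 0$, which is a choice made here for illustration.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def euler(x):
    return x * x + x + 41

prime_run = [euler(x) for x in range(40)]   # x = 0, 1, ..., 39
first_failure = euler(40)                   # 1681 = 41 * 41
```

All forty values in `prime_run` are prime, while `euler(40)` is a perfect square, in line with the remark that no polynomial can take only prime values.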

The nontrivial statement of the problem and the problem's solution
involve Theorem 1.2: the set of prime numbers is the set of all positive
values at points in $(Z^+)^n$ of a certain polynomial in $Z[x_1, \ldots, x_n]$ (or, if we
prefer, $n$ may be replaced by $4n$; see the reduction step in 1.3). Matijasevič
showed that there is a suitable polynomial of degree 37 in 24 variables.

This is actually a general result which has nothing to do with the 
specific properties of prime numbers: 


1.5. Proposition. Let $E \subset Z^+$ be a Diophantine set. Then there exists a
polynomial $g \in Z[x_0, \ldots, x_n]$ such that $E$ coincides with the set of
positive values of $g$ at points in $(Z^+)^{n+1}$.


Proof. Let $E$ be the projection of the 0-level of the polynomial
$f(x_0, x_1, \ldots, x_n)$ onto the $x_0$-coordinate. We set

$$g = x_0 \big[ 1 - f^2(x_0, x_1, \ldots, x_n) \big].$$

Clearly, the positive values of $g$ are precisely the elements of $E$. □
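
The device $g = x_0(1 - f^2)$ is easy to test on a small Diophantine set. In the sketch below the polynomial `f` is a hypothetical example whose 0-level projects onto the composite numbers; the search is restricted to a finite box, so only elements up to the bound are recovered.

```python
def f(x0, x1, x2):
    # Hypothetical example: zero exactly when x0 = (x1 + 1)(x2 + 1),
    # so the projection onto x0 is the set of composite numbers.
    return x0 - (x1 + 1) * (x2 + 1)

def g(x0, x1, x2):
    """The polynomial of Proposition 1.5: positive only where f vanishes,
    and equal to x0 there."""
    return x0 * (1 - f(x0, x1, x2) ** 2)

def positive_values(bound):
    """Positive values of g on the box [1, bound]^3."""
    r = range(1, bound + 1)
    return {g(a, b, c) for a in r for b in r for c in r if g(a, b, c) > 0}

def is_composite(n):
    return any(n % d == 0 for d in range(2, n))

found = positive_values(30)
```

Whenever $f \ne 0$ the factor $1 - f^2$ is zero or negative, so the only positive values of `g` are the values of $x_0$ at solutions of $f = 0$.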
It remains only to use the fact that the set of prime numbers is
decidable, and hence Diophantine by Theorem 1.2.
The following sets are also sets of positive integer values of polynomials:


1.6. The sequences $\{1, 10, 100, \ldots, 10^n, \ldots\}$ and $\{1, 2^2, 3^{3^3}, \ldots, n^{n^{\cdot^{\cdot^{\cdot^n}}}}$ ($n$
times), $\ldots\}$.


It is amazing that the values of the corresponding polynomials can drop 
to zero and below in neighborhoods of points where these values are so 
large. 


1.7. The Fermat set $\{n \mid n > 2$ and $x^n + y^n + z^n = 0$ is solvable in $Z\}$. Thus,
the variable $n$ can be moved from the exponent to the coefficients of a
Diophantine equation.

1.8. The set $\{10e_1, 10^2 e_2, \ldots, 10^n e_n, \ldots\}$, where $e_i$ is the $i$th digit after the
decimal point in the decimal expansion of $e$ (or $\pi$ or $\sqrt[3]{2}$, or any other
“computable” irrational number).


1.9. The set of all partial quotients in the continued fraction expansion of $e$, or
$\pi$, or $\sqrt[3]{2}$.

We recall that in the case of $\sqrt[3]{2}$ it is not known whether this set is
finite or infinite.

These examples show that many number theoretic questions reduce to 
problems of the solvability of Diophantine equations. In Chapter VII we 
shall see that, in a certain sense, “almost all of mathematics” reduces to 
such problems. 


2 Plan of proof 


2.1. In this section we introduce some auxiliary notions and give the plan 
of proof for Theorem 1.2. 

We shall temporarily introduce a class of sets which are intermediate
between enumerable and Diophantine sets. In order to define this class, we
consider the map which to every subset $E \subset (Z^+)^n$ associates the set
$F \subset (Z^+)^n$ which is given by the following rule:

$$\langle x_1, \ldots, x_n \rangle \in F \iff \forall k \in [1, x_n],\ \langle x_1, \ldots, x_{n-1}, k \rangle \in E.$$

We shall say that $F$ is obtained from $E$ by applying the bounded universal
quantifier to the $n$th coordinate. We define similarly the operation of
applying the bounded universal quantifier to any coordinate.


2.2. Definition-Lemma. Consider the following three classes of subsets of
$(Z^+)^n$ for each $n$.


(I) Projections of level sets of primitive recursive functions.

(II) The least class of sets which contains the level sets of polynomials
with integer coefficients and which is closed with respect to taking
finite direct products, finite unions, finite intersections, projections,
and applying the bounded universal quantifier.

(III) Projections of level sets of polynomials with integer coefficients.


The following assertions hold for these classes:


(a) The class (I) coincides with the class of enumerable sets, and the class
(III) coincides with the class of Diophantine sets. We shall call sets in
the class (II) D-sets.


(b) (I) $\supset$ (II) $\supset$ (III).


Proof.

(a) In Theorem 4.3 of Chapter V we showed that the class of primitive
enumerable sets coincides with the class of enumerable sets. The rest of (a)
merely consists of definitions.

(b) Only the inclusion (II) $\subset$ (I) is not completely obvious. First of all,
the $m$-level set of a polynomial $f$ is the same as the 1-level set of the
primitive recursive function $(f - m)^2 + 1$. Hence, to verify (II) $\subset$ (I) it
suffices to show that the class (I) is closed with respect to (finite) direct
product, union, intersection, and the bounded universal quantifier. All
except for the last of these were established in Lemma 4.8 of Chapter V.

Finally, suppose $F$ is the image of a primitive enumerable set $E$ under
the bounded universal quantifier:

$$\langle x_1, \ldots, x_{n-1}, x_n \rangle \in F \iff \forall k \le x_n,\ \langle x_1, \ldots, x_{n-1}, k \rangle \in E.$$

Starting with the function $f(x_1, \ldots, x_{n-1}, x_n; y_1, \ldots, y_m)$ whose 1-level
projects onto $E$, we want to construct a function $g$ whose 1-level projects
onto $F$. A natural idea is to consider as an approximation to $g$ the product

$$\prod_{k=1}^{x_n} f(x_1, \ldots, x_{n-1}, k; y_{1k}, \ldots, y_{mk}),$$

where the $y_{ik}$ are “independent variables.” The only problem is that the
number of arguments of this “function” increases with $x_n$. To deal with
this, we apply the Gödel function $\mathrm{Gd}(k, t)$, which was defined in
subsection 4.9 of Chapter V. The function $g$ will now depend on $x_1, \ldots, x_n$ and
on $m$ additional arguments $t_1, \ldots, t_m$:

$$g(x_1, \ldots, x_n; t_1, \ldots, t_m) = \prod_{k=1}^{x_n} f(x_1, \ldots, x_{n-1}, k; \mathrm{Gd}(k, t_1), \ldots, \mathrm{Gd}(k, t_m)).$$

This function is primitive recursive, because the $k$th factor is obtained
from $f$ and $\mathrm{Gd}$ by substitution and identifying arguments, and then $g$ is
constructed from such factors by recursion.

We now verify that the set $F$ is the projection of the 1-level of $g$ onto the
$\langle x_1, \ldots, x_n \rangle$-coordinates. In fact, if $g(x_1, \ldots, t_m) = 1$, then for all $1 \le k
\le x_n$ we have $f(x_1, \ldots, k, \mathrm{Gd}(k, t_1), \ldots, \mathrm{Gd}(k, t_m)) = 1$, i.e., for all $1 \le
k \le x_n$ the point $\langle x_1, \ldots, x_{n-1}, k \rangle$ belongs to $E$. This means that
$\langle x_1, \ldots, x_n \rangle \in F$.

Conversely, if $\langle x_1, \ldots, x_n \rangle \in F$, then for $1 \le k \le x_n$ we can lift the
point $\langle x_1, \ldots, x_{n-1}, k \rangle$ to the 1-level of $f$. Let the $y$-coordinates of the
resulting point be $y_{1k}, \ldots, y_{mk}$. We solve the following system of
equations for the $t_i$:

$$\mathrm{Gd}(k, t_i) = y_{ik}, \quad \text{for all } 1 \le k \le x_n.$$

This is possible by the fundamental property of $\mathrm{Gd}$. The resulting values
for the $t_i$, along with $x_1, \ldots, x_n$, make $g$ equal to one. This completes the
proof of Lemma 2.2. □


2.3. The plan for the rest of the proof of Theorem 1.2 is as follows. In §3
we show that the classes (I) and (II) coincide, and in §§4-7 we show that
(II) and (III) coincide.


2.4. Remark. In the course of proving Lemma 2.2, we obtained the
following facts, which should always be kept in mind in what follows:


(a) In the definitions of the classes (I)-(III) we may always replace “level
sets” by “1-level sets” (by going from $f$ to $(f - m)^2 + 1$).

(b) All of the classes (I)-(III) are closed with respect to (finite) products,
intersections, unions, and also projections. (The proof of this for the
class (I) in Lemma 4.8 of Chapter V is also applicable to the class
(III).)


We encounter much greater difficulty in treating the bounded universal 
quantifier. Indeed, the most technical part of the proof in §§4-7 is 
concerned with showing that the class of Diophantine sets is closed with 
respect to the bounded universal quantifier. 


3 Enumerable sets are D-sets 


Let $f : (Z^+)^n \to Z^+$ be a primitive recursive function. Its 1-level can be
represented as the projection onto the first $n$ coordinates of the set
$\Gamma_f \cap [(Z^+)^n \times \{1\}]$, where $\Gamma_f$ is the graph of $f$. Thus, an enumerable set
can be obtained as a projection of the intersection of the graphs of two
primitive recursive functions. Since, by definition, the class of D-sets is
closed with respect to projections and intersections, the assertion in the
title of this section follows from the following fact:


3.1. Proposition. The graphs of primitive recursive functions are D-sets. 


Proof. The graphs of the basic functions are Diophantine. The stability of
the property of graphs “being D-sets” relative to the composition and
juxtaposition of functions is verified by the same arguments as in the proof
of Lemma 4.8 of Chapter V. It remains to prove the stability under
recursion. We shall first of all need information about the graph of Gödel's
function. Here it is more convenient to use gd instead of Gd.


3.2. Lemma. The graph of the Gödel function $\mathrm{gd}(u, k, t) = \mathrm{rem}(1 + kt, u)$ is
Diophantine, and a fortiori, a D-set.
Proof. The set
$$\Gamma_{\mathrm{gd}} = \{\langle u, k, t, y \rangle \mid y \text{ is the remainder when } u \text{ is divided by } 1 + kt\}$$
is the intersection of the following two sets in $(Z^+)^4$:

$$E_1 : y \le 1 + kt;$$
$$E_2 : u - y \ge 0 \text{ and is divisible by } 1 + kt.$$
Both $E_1$ and $E_2$ are Diophantine. In fact, $E_1$ is a projection of the 0-level of
the polynomial $2 + kt - y - y_1$, and $E_2$ is a projection of the 0-level of the
polynomial $u - y - (1 + kt)(y_2 - 1)$. The lemma is proved. □


3.3. Corollary. Let $f$ and $g$ be functions of $n$ and $n + 2$ arguments,
respectively, whose graphs are D-sets. Then the following equations determine
D-sets in the $(x_1, \ldots, x_{n+1}, u, t, \ldots)$-coordinate space (where any
additional coordinates may follow the $t$):

$$E : \mathrm{gd}(u, 1, t) = f(x_1, \ldots, x_n);$$
$$F : \mathrm{gd}(u, x_{n+1} + 1, t) = g(x_1, \ldots, x_n, x_{n+1}, \mathrm{gd}(u, x_{n+1}, t)).$$


Proof. Introducing extra coordinates after the $t$ amounts to taking the
direct product with $(Z^+)^r$, and this, of course, takes D-sets to D-sets.

$E$ can be represented as a projection of the intersection of the sets
$\mathrm{gd}(u, k, t) = w$, $f(x_1, \ldots, x_n) = w$, and $k = 1$ (where $k$ and $w$ are auxiliary
coordinates). Since $\Gamma_{\mathrm{gd}}$ and $\Gamma_f$ are D-sets, the same is true for $E$.

Similarly, $F$ can be represented as a projection of the intersection of the
sets

$$\mathrm{gd}(u, x_{n+1} + 1, t) = w_1,$$
$$\mathrm{gd}(u, x_{n+1}, t) = w_2,$$
$$g(x_1, \ldots, x_n, x_{n+1}, w_2) = w_1.$$

These are D-sets, because $\Gamma_{\mathrm{gd}}$ and $\Gamma_g$ are D-sets. □
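
The two polynomials in the proof of Lemma 3.2 can be checked numerically. The sketch below rests on one assumption that should be stated loudly: the remainder is taken in the range $1, \ldots, 1 + kt$ (so that it lies in $Z^+$), which is what the polynomial $2 + kt - y - y_1$ with $y_1 \in Z^+$ encodes. The helper names are invented for this illustration.

```python
def rem(m, u):
    """Remainder of u modulo m, normalized to lie in 1..m.
    ASSUMPTION: this normalization is chosen so that values stay in Z+."""
    r = u % m
    return r if r != 0 else m

def gd(u, k, t):
    return rem(1 + k * t, u)

def witnesses(u, k, t):
    """Return (y, y1, y2) with y = gd(u, k, t) and positive integers
    y1, y2 at which both polynomials from the proof vanish."""
    m = 1 + k * t
    y = gd(u, k, t)
    y1 = 2 + k * t - y          # zero of 2 + kt - y - y1
    y2 = (u - y) // m + 1       # zero of u - y - (1 + kt)(y2 - 1)
    return y, y1, y2

def solutions_for_y(u, k, t):
    """All y in Z+ lying in both E_1 and E_2; the graph property says
    there is exactly one."""
    m = 1 + k * t
    return [y for y in range(1, u + 1) if y <= m and (u - y) % m == 0]
```

For every triple $(u, k, t)$ the witnesses are positive and the value $y$ is the unique point of $E_1 \cap E_2$ above it, which is the content of the lemma under the stated normalization.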


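3.4. Proof of Proposition 3.1. Recall that it remains to verify the
following assertion: Let $h$ be the function defined recursively from
functions $f$ and $g$ by the equations

$$h(x_1, \ldots, x_n, 1) = f(x_1, \ldots, x_n),$$
$$h(x_1, \ldots, x_n, k + 1) = g(x_1, \ldots, x_n, k, h(x_1, \ldots, x_n, k));$$

then the graph $\Gamma_h$:
$$\langle x_1, \ldots, x_{n+1}, \eta \rangle \in \Gamma_h \iff \eta = h(x_1, \ldots, x_{n+1})$$
is a D-set whenever the graphs $\Gamma_f$ and $\Gamma_g$ are D-sets.
First step. We set $\Gamma_h = \Gamma^1 \cup \Gamma^2$, where $x_{n+1} = 1$ on $\Gamma^1$ and $x_{n+1} \ge 2$ on
$\Gamma^2$. Since
$$\langle x_1, \ldots, x_{n+1}, \eta \rangle \in \Gamma^1 \iff x_{n+1} = 1 \text{ and } \langle x_1, \ldots, x_n, \eta \rangle \in \Gamma_f,$$
it follows that $\Gamma^1$ is the intersection of $\Gamma_f \times Z^+$ and a D-set, and therefore
is a D-set. It remains to verify that $\Gamma^2$ is also a D-set.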


Second step. In the $(x_1, \ldots, x_{n+1}, \eta, u, t)$-coordinate space we consider
the sets

$$E_1 : \eta = \mathrm{gd}(u, x_{n+1}, t);$$
$$E_2 : \mathrm{gd}(u, 1, t) = f(x_1, \ldots, x_n);$$
$$E_3 : x_{n+1} > 1,\ \mathrm{gd}(u, k, t) = g(x_1, \ldots, x_n, k - 1, \mathrm{gd}(u, k - 1, t)) \text{ for all } 2 \le k \le x_{n+1}.$$

It is easy to see that $\Gamma^2 = \mathrm{pr}\big(\bigcap_{i=1}^3 E_i\big)$. In fact, as in §4 of Chapter V, we
obtain inclusion in one direction by comparing $E_2$ and $E_3$ with the
inductive definition of $h$, and in the other direction by suitably choosing
the parameters $u$ and $t$ in Gödel's function. Thus, it remains to show that
the $E_i$ are D-sets.
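
The "suitable choice of the parameters $u$ and $t$" is an application of the Chinese remainder theorem. The sketch below shows one way to find such parameters for a prescribed finite sequence; it is not taken from the text. It assumes the remainder is normalized to the range $1, \ldots, 1 + kt$, and it relies on `pow(a, -1, m)` for modular inverses, which requires Python 3.8 or later.

```python
from math import factorial

def rem(m, u):
    """Remainder of u modulo m, normalized to 1..m (an assumption)."""
    r = u % m
    return r if r != 0 else m

def gd(u, k, t):
    return rem(1 + k * t, u)

def encode(seq):
    """Find positive integers (u, t) with gd(u, k, t) == seq[k-1] for all
    1 <= k <= len(seq), where seq consists of positive integers."""
    n = len(seq)
    # With t divisible by every integer up to n, the moduli 1 + k*t are
    # pairwise coprime; t >= max(seq) keeps every entry below its modulus.
    t = factorial(max(n, max(seq)))
    u, modulus = 0, 1
    for k, a in enumerate(seq, start=1):
        m = 1 + k * t
        # Adjust u by a multiple of `modulus` so that u == a (mod m).
        step = ((a - u) * pow(modulus, -1, m)) % m
        u += modulus * step
        modulus *= m
    return u, t

sequence = [3, 1, 4, 1, 5, 9, 2, 6]
u, t = encode(sequence)
decoded = [gd(u, k, t) for k in range(1, len(sequence) + 1)]
```

In the proof the sequence to be encoded is $h(x_1, \ldots, x_n, 1), \ldots, h(x_1, \ldots, x_n, x_{n+1})$, so that the conditions $E_2$ and $E_3$ hold for the resulting pair $(u, t)$.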

Third step. $E_1$ is the graph of gd with some additional coordinates. $E_2$
was shown to be a D-set in the proof of Corollary 3.3.

Finally, $E_3$ is “almost” obtained from the set $F$ in Corollary 3.3 by
applying the bounded universal quantifier to the $x_{n+1}$-coordinate. More
precisely (for brevity, we ignore the $\eta$-coordinate):

$$\langle x_1, \ldots, x_{n+1}, u, t\rangle \in E_3 \iff \forall k \in [2, x_{n+1}]\ \ \langle x_1, \ldots, x_n, k-1, u, t\rangle \in F$$

$$\iff \forall k \in [1, x_{n+1}-1]\ \ \langle x_1, \ldots, x_n, k, u, t\rangle \in F.$$

Consequently, if we apply to $F$ the bounded universal quantifier in the
$x_{n+1}$-coordinate, we obtain a D-set which is the same as $E_3$ with the
$x_{n+1}$-coordinates of all its points decreased by 1. So it remains to see that
the operation of shifting back by 1 preserves the property of “being a
D-set,” and this follows easily from the definitions. The proof is complete. □
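The way a single pair of parameters $u$, $t$ stores the whole computation of a recursively defined function can be checked numerically. The sketch below assumes that gd(u, k, t) is the remainder of u on division by 1 + kt (this is how Gödel’s function is used in §4 of this chapter); the helper names `encode` and `h_values` and the choice t = N! are illustrative assumptions, not part of the text.

```python
from math import factorial

def gd(u, k, t):
    """Goedel's function, assumed here to be the remainder of u modulo 1 + k*t."""
    return u % (1 + k * t)

def encode(values):
    """Find (u, t) with gd(u, k, t) == values[k-1] for k = 1..len(values).

    t = N! with N >= len(values) and N >= max(values), so the moduli 1 + k*t
    are pairwise coprime and larger than every value; u is then assembled
    step by step with the Chinese remainder theorem.
    """
    n = max(len(values), max(values))
    t = factorial(n)
    u, modulus = 0, 1
    for k, v in enumerate(values, start=1):
        m = 1 + k * t
        # Choose s so that u + modulus*s is congruent to v modulo m.
        s = ((v - u) * pow(modulus, -1, m)) % m
        u += modulus * s
        modulus *= m
    return u, t

def h_values(f_val, g, length):
    """Values h(1), ..., h(length) of the recursion h(1) = f_val, h(k+1) = g(k, h(k))."""
    vals = [f_val]
    for k in range(1, length):
        vals.append(g(k, vals[-1]))
    return vals

# Toy recursion with no x-parameters: h(1) = 1, h(k+1) = (k+1) * h(k), i.e. h(k) = k!.
vals = h_values(1, lambda k, prev: (k + 1) * prev, 5)
u, t = encode(vals)
```

With these u and t, the analogue of $E_2$ reads gd(u, 1, t) = 1 and the analogue of $E_3$ reads gd(u, k, t) = k · gd(u, k-1, t) for 2 ≤ k ≤ 5, so the recursion has been replaced by conditions on gd under a bounded universal quantifier.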


4 The reduction 


4.1. The next three sections are devoted to proving that the class of D-sets 
coincides with the class of Diophantine sets. As noted at the end of §2, it 
suffices to show that the class of Diophantine sets is closed with respect to 
the bounded universal quantifier. 

Let $f(x_1, \ldots, x_n, k, y_1, \ldots, y_m)$ be any nonconstant polynomial with
integer coefficients. $f$ will be fixed for the duration of this section. Let $d$ be
the degree of $f$, and let $c$ be the sum of the absolute values of its
coefficients.

We define the set $E$ by the condition

$$\langle x_1, \ldots, x_n, y\rangle \in E \iff \forall k \le y\ \ \exists y_1, \ldots, y_m,\ \ f(x_1, \ldots, x_n, k, y_1, \ldots, y_m) = 0.$$


We want to show that E is Diophantine. In this section we prove the 
following reduction step, which is due to Davis, Putnam, and Robinson. 


4.2. Proposition. $E$ is Diophantine if the following three sets are Diophantine:

$$x_1 = x_2^{x_3};$$

$$x_1 = x_2!;$$

$$\frac{x_1}{x_2} = \binom{x_3/x_4}{x_5},\quad x_3 > x_4 x_5,$$

where $\binom{n}{k} = n(n-1)\cdots(n-k+1)/k!$ is the “binomial coefficient.”

The proof of this and all subsequent propositions of this type follows a
standard pattern. To show that $E$ is Diophantine, we introduce auxiliary
sets $E_i$ with the following properties:

(a) $E = \mathrm{pr} \bigcap_{i=1}^{N} E_i$;
(b) the $E_i$ are Diophantine.

But usually we are not able to establish directly that all the $E_i$ are
Diophantine, so we apply the same procedure to certain of the $E_i$. Thus,
the proof that $E$ is Diophantine has a tree-like pattern.

The exposition of each step will consist of the following stages: the
introduction of auxiliary variables, which disappear when we project;
explicit construction of the sets $E_i$; the proof of the inclusion
$E \subset \mathrm{pr} \bigcap_{i=1}^{N} E_i$; and the proof of the inclusion $E \supset \mathrm{pr} \bigcap_{i=1}^{N} E_i$.


4.3. PROOF OF PROPOSITION 4.2. We denote the auxiliary variables by the
symbols $Y, N, K, Y_1, \ldots, Y_m$. We introduce the sets $E_i$ in the
$\langle x_1, \ldots, x_n, y, Y, N, K, Y_1, \ldots, Y_m\rangle$-space by the following relations:

$$E_1:\ N \ge c\,(x_1 \cdots x_n\, y\, Y)^d,\quad Y \ge y$$

(intuitively speaking, the right side of the first inequality gives a rough
estimate for the value of the polynomial $f$ at the point
$\langle x_1, \ldots, x_n, y, y_1, \ldots, y_m\rangle$ if all $y_i \le Y$).

$$E_2:\ 1 + KN! = \prod_{k=1}^{y} (1 + kN!)$$

(this is a “large modulus”; $f = 0$ will be replaced by divisibility by this
modulus).

$$E_3:\ f(x_1, \ldots, x_n, K, Y_1, \ldots, Y_m) \equiv 0 \bmod (1 + KN!);$$

$$E_{3+i}:\ \prod_{j \le Y} (Y_i - j) \equiv 0 \bmod (1 + KN!),\quad i = 1, \ldots, m.$$

We define the set $E'$ as $\bigcap_{i=1}^{m+3} E_i$.
PROOF OF THE INCLUSION $E \subset \mathrm{pr}\, E'$. Given a point $\langle x_1, \ldots, x_n, y\rangle \in E$,
we must choose values for the other coordinates so that the relations
$E_1, \ldots, E_{m+3}$ are fulfilled.

By the definition of $E$, each point $\langle x_1, \ldots, x_n, k\rangle$, $k \le y$, can be lifted
to the 0-level of $f$:

$$f(x_1, \ldots, x_n, k, y_{1k}, \ldots, y_{mk}) = 0.$$

For $Y$ we take the maximum of $y$ and the $y_{ik}$. Then, as before, we find the
$Y_i$ and $N$ by solving the system of Gödel equations

$$\mathrm{gd}(Y_i, k, N!) = y_{ik}\quad \text{for all } 1 \le k \le y.$$

The proof of Gödel’s lemma shows that the $Y_i$ and $N$ may be taken
arbitrarily large: in particular, so as to satisfy $E_1$. The number $K$ is
uniquely determined by $E_2$.

All the choices have now been made. The relation $E_{3+i}$ holds, because,
by the definition of $Y_i$ and gd, we can find a number $Y_i - j$ with $j \le Y$,
namely $j = y_{ik}$, such that $Y_i - j \equiv 0 \bmod (1 + kN!)$, for every $k \le y$. Hence,
the product on the left in $E_{3+i}$ is divisible by all the $1 + kN!$, $1 \le k \le y$,
which are pairwise relatively prime, since $N \ge y$ by $E_1$. Therefore, this
product is divisible by $1 + KN!$.

Finally, to verify $E_3$ we note that $E_2$ implies the congruence
$K \equiv k \bmod (1 + kN!)$, $1 \le k \le y$, because $(1 + KN!) - (1 + kN!) \equiv 0 \bmod (1 + kN!)$
and $N!$ is relatively prime to $1 + kN!$. But then, since $y_{ik} \equiv Y_i \bmod (1 + kN!)$
by our choice of $Y_i$, we find that

$$f(x_1, \ldots, x_n, K, Y_1, \ldots, Y_m) \equiv f(x_1, \ldots, x_n, k, y_{1k}, \ldots, y_{mk}) = 0 \bmod (1 + kN!).$$

Since the moduli $1 + kN!$ are pairwise relatively prime, this congruence
implies $E_3$.


PROOF OF THE INCLUSION $\mathrm{pr}\, E' \subset E$. Given a point

$$\langle x_1, \ldots, x_n, y, Y, N, K, Y_1, \ldots, Y_m\rangle$$

whose coordinates satisfy the relations $E_1, \ldots, E_{m+3}$, we must find a
vector $\langle y_{1k}, \ldots, y_{mk}\rangle$ for each $k \le y$ such that

$$f(x_1, \ldots, x_n, k, y_{1k}, \ldots, y_{mk}) = 0.$$

To do this we let $p_k$ denote any prime divisor of $1 + kN!$, and we set
$y_{ik}$ = the remainder when $Y_i$ is divided by $p_k$.

We claim that these $y_{ik}$ give us the required equality. In fact, $E_3$ implies
that $f(x_1, \ldots, x_n, k, y_{1k}, \ldots, y_{mk}) \equiv 0 \bmod p_k$. It suffices to show that the
number on the left is less than $p_k$ in absolute value. We have

$p_k$ divides $\prod_{j \le Y} (Y_i - j)$ by $E_{3+i}$

$\Rightarrow$ $p_k$ divides $Y_i - j$ for some $j \le Y$

$\Rightarrow$ $y_{ik}$ = the remainder when $Y_i$ is divided by $p_k$ is $\le Y$

$\Rightarrow$ $|f(x_1, \ldots, x_n, k, y_{1k}, \ldots, y_{mk})| \le c\,(x_1 \cdots x_n\, y\, Y)^d \le N < p_k$,

where the second inequality in the last line follows from $E_1$, and the third
inequality follows because $p_k$ divides $1 + kN!$.
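The arithmetic behind the “large modulus” in $E_2$ can be checked on small numbers. The sketch below is only an illustration: the function name `large_modulus`, the sample values y = 4, N = 6, and the test polynomial are assumptions chosen for the example.

```python
from math import factorial, gcd, prod

def large_modulus(y, N):
    """Return (K, M) with M = 1 + K*N! = product of (1 + k*N!) for k = 1..y (relation E_2)."""
    t = factorial(N)
    M = prod(1 + k * t for k in range(1, y + 1))
    K, r = divmod(M - 1, t)
    assert r == 0  # every factor is 1 modulo N!, hence so is the product
    return K, M

y, N = 4, 6                      # N >= y, as guaranteed by E_1
t = factorial(N)
K, M = large_modulus(y, N)
moduli = [1 + k * t for k in range(1, y + 1)]

def f(z):
    """An arbitrary integer polynomial, used to illustrate f(K) = f(k) modulo 1 + k*N!."""
    return z ** 3 - 2 * z + 5

pairwise_coprime = all(
    gcd(moduli[i], moduli[j]) == 1
    for i in range(len(moduli)) for j in range(i + 1, len(moduli))
)
K_matches_k = all(K % m == k for k, m in enumerate(moduli, start=1))
f_matches = all((f(K) - f(k)) % m == 0 for k, m in enumerate(moduli, start=1))
```

The single number K thus behaves like k modulo each factor 1 + kN!, which is what lets one congruence modulo 1 + KN! stand for y separate equations.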


CONCLUSION OF THE PROOF. It remains to show that the sets $E_1, \ldots, E_{m+3}$
are Diophantine if the sets in Proposition 4.2 are Diophantine. In fact, if
we trivially introduce new variables and make substitutions, we can first
reduce the verification that all the $E_i$ are Diophantine to showing that the
following sets are Diophantine:

$$x_1 = x_2!;$$

$$x_1 = \prod_{k \le x_2} (1 + k x_3);$$

$$x_1 = \prod_{j \le x_3} (x_2 - j),\quad x_2 > x_3.$$


It then remains to notice that the second of these relations can be written
in the form

$$x_1 = x_3^{x_2}\, x_2!\, \binom{x_2 + 1/x_3}{x_2},$$

and the third relation can be written as

$$x_1 = x_3!\, \binom{x_2 - 1}{x_3},\quad x_2 > x_3.$$

This completes the proof of Proposition 4.2. □


5 Construction of a special Diophantine set 


5.1. In this section we begin the proof that the three sets in Proposition 4.2
are Diophantine. In order that the reader may better appreciate this stage
in the proof, we mention that the most troublesome obstacle here is the
rapid growth of one of the coordinates in comparison to the others (for
example, $x_1 = x_2!$). J. Robinson had the following key idea. She proved
that, if we know that any specific set in $(\mathbf{Z}^+)^2$ is Diophantine and has one
coordinate which grows faster than any power of the other but slower
than, say, $x^x$ (for example, exponentially), we may then conclude that all
enumerable sets are Diophantine. After this, Matijasevič and Čudnovskiĭ
were able to show that a certain set of that type (connected with Fibonacci
numbers) is Diophantine. For a history of the question, see Matijasevič’s
article “Diophantine Sets” in Uspehi Mat. Nauk, vol. XXVII, No. 5 (1972)
(translated in Russian Math. Surveys).

In this section we give a construction which is an improved version of
the original construction. Its idea is based on the following observation.
Let $x^2 - dy^2 = 1$ be Pell’s equation (where $d \in \mathbf{Z}^+$ is not a perfect square).
Its solutions $\langle x, y\rangle \in (\mathbf{Z}^+)^2$ form a semigroup with composition law

$$(x_1 + y_1\sqrt{d})(x_2 + y_2\sqrt{d}) = x_3 + y_3\sqrt{d}.$$

This is a cyclic semigroup. That is, let $\langle x_1, y_1\rangle$ be the solution with the
least first coordinate. Then any other solution has the form $\langle x_n, y_n\rangle$, where
$n \in \mathbf{Z}^+$, and

$$x_n + y_n\sqrt{d} = (x_1 + y_1\sqrt{d})^n.$$

We call $n$ the number of the solution $\langle x_n, y_n\rangle$.

The coordinates $x_n$ and $y_n$ grow exponentially with $n$, so that the set of
solutions of Pell’s equation, and also the projections of this set on the x-
and y-axes, are Diophantine sets having logarithmic density. This is not yet
enough: we still have the problem of including the solution number $n$
among the coordinates of a Diophantine set. Only then can we apply
Robinson’s technique. This is what will be done below.
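The cyclic structure of the solution semigroup is easy to observe for a small $d$. The following sketch uses a brute-force search for the first solution, which is only practical for small $d$; the function names are illustrative assumptions.

```python
from math import isqrt

def fundamental_solution(d):
    """Least positive solution (x, y) of x^2 - d*y^2 = 1, found by brute force over y."""
    y = 1
    while True:
        x2 = 1 + d * y * y
        x = isqrt(x2)
        if x * x == x2:
            return x, y
        y += 1

def compose(s1, s2, d):
    """Composition law: (x1 + y1*sqrt(d)) * (x2 + y2*sqrt(d)) = x3 + y3*sqrt(d)."""
    (x1, y1), (x2, y2) = s1, s2
    return x1 * x2 + d * y1 * y2, x1 * y2 + x2 * y1

def solutions(d, count):
    """The first `count` solutions, obtained as successive powers of the first one."""
    first = fundamental_solution(d)
    sols = [first]
    while len(sols) < count:
        sols.append(compose(sols[-1], first, d))
    return sols

def brute_force_solutions(d, y_max):
    """All positive solutions with y <= y_max, found without using the semigroup structure."""
    found = []
    for y in range(1, y_max + 1):
        x2 = 1 + d * y * y
        x = isqrt(x2)
        if x * x == x2:
            found.append((x, y))
    return found
```

For d = 2 the powers of the first solution ⟨3, 2⟩ are ⟨17, 12⟩, ⟨99, 70⟩, ⟨577, 408⟩, and the exhaustive search up to y = 408 finds exactly these four, in agreement with the statement that every solution is a power of the first one.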


5.2. Notation. We consider Pell’s equation with variable $d$. Its first solution
generally varies as a function of $d$ in an uncontrollable fashion, so that it is
convenient to choose only those $d$ whose first solutions have the simple
special form $\langle a, 1\rangle$, $a \in \mathbf{Z}^+$. Obviously, then $d = a^2 - 1$.

We shall call the equation $x^2 - (a^2 - 1)y^2 = 1$ the a-equation. We define
the two sequences $x_n(a)$ and $y_n(a)$ as the coordinates of its $n$th solution:

$$x_n(a) + y_n(a)\sqrt{a^2-1} = \left(a + \sqrt{a^2-1}\right)^n.$$

For each $n$, a formal definition of $x_n(a)$ and $y_n(a)$ as polynomials in $a$ can
easily be given by induction on $n$. Then the expressions $x_n(a)$ and $y_n(a)$
will make sense for all $n \in \mathbf{Z}$ and $a \in \mathbf{C}$. In particular,

$$x_n(1) = 1,\quad y_n(1) = n,$$

and all the formulas given below remain true. The basic result of this
section is

5.3. Proposition. The set
$$E:\ y = y_n(a),\quad a > 1$$
in the $\langle y, n, a\rangle$-space is Diophantine.


The proof uses the elementary number theoretic properties of the
sequences $x_n(a)$ and $y_n(a)$, most of which will be verified at the end of the
section (see 5.8). The idea for determining $n$ in a Diophantine way from
$\langle y, a\rangle$ is to observe that $y_n(a) \equiv n \bmod (a - 1)$ (Lemma 5.4). This uniquely
determines $n$ as long as $n < a - 1$. To pass to the general case, we
introduce an auxiliary A-equation with $A$ large, and find formulas for its
$n$th solution (using $y$) in which $n$ only appears in a Diophantine context.

Formally, the proof that $E$ is Diophantine follows the pattern described
in 4.2. In addition to the basic variables $y, n, a$, we introduce six auxiliary
variables: $x, x_1, y_1, A, x_2, y_2$. We set:

$$E_1:\ y \ge n,\quad a > 1;$$
$$E_2:\ x^2 - (a^2-1)y^2 = 1;$$
$$E_3:\ y_1 \equiv 0 \bmod 2x^2y^2;$$
$$E_4:\ x_1^2 - (a^2-1)y_1^2 = 1;$$
$$E_5:\ A = a + x_1^2(x_1^2 - a);$$
$$E_6:\ x_2^2 - (A^2-1)y_2^2 = 1;$$
$$E_7:\ y_2 - y \equiv 0 \bmod x_1^2;$$
$$E_8:\ y_2 \equiv n \bmod 2y.$$

Let $E' = \bigcap_{i=1}^{8} E_i$. We show that $\mathrm{pr}\, E' = E$.

The inclusion $E \subset \mathrm{pr}\, E'$. Given $\langle y, n, a\rangle \in E$, we must find values for
the other variables so that $E_1, \ldots, E_8$ hold. As before, we shall not
introduce any new symbols for these values; after we choose, say, a value
for $x$, the letter $x$ will become the name for this value.

$E_1$ is automatically satisfied: $y_n(a) \ge n$ for all $a > 1$, $n \ge 1$ (induction on
$n$). We find $x$ uniquely from $E_2$: $x = x_n(a)$. We take $\langle x_1, y_1/2x^2y^2\rangle$ to be
any solution of the Pell equation $X^2 - (a^2 - 1)(2x^2y^2)^2\,Y^2 = 1$; this gives
$E_3$ and $E_4$. $A$ is found uniquely from $E_5$. We take $\langle x_2, y_2\rangle$ to be the $n$th solution of
the A-equation, so that $E_6$ holds. Now all choices have been made. To verify $E_7$ and $E_8$ we
need two lemmas.


5.4. Lemma. $y_k(a) \equiv k \bmod (a - 1)$.
5.5. Lemma. If $a \equiv b \bmod c$, then $y_n(a) \equiv y_n(b) \bmod c$.


These lemmas will be proved in 5.8.
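Both lemmas are easy to test numerically before reading the proofs. The sketch below computes the pair ⟨x_n(a), y_n(a)⟩ by repeated multiplication by $a + \sqrt{a^2-1}$; the function name `pell_pair` and the tested ranges are illustrative choices.

```python
def pell_pair(n, a):
    """Return (x_n(a), y_n(a)) for n >= 0, the coordinates of (a + sqrt(a^2 - 1))^n."""
    d = a * a - 1
    x, y = 1, 0                              # n = 0
    for _ in range(n):
        # (x + y*sqrt(d)) * (a + sqrt(d)) = (a*x + d*y) + (x + a*y)*sqrt(d)
        x, y = a * x + d * y, x + a * y
    return x, y

def lemma_5_4_holds(a, k):
    """y_k(a) is congruent to k modulo a - 1."""
    return (pell_pair(k, a)[1] - k) % (a - 1) == 0

def lemma_5_5_holds(a, b, c, n):
    """If a and b are congruent modulo c, then so are y_n(a) and y_n(b)."""
    assert (a - b) % c == 0
    return (pell_pair(n, a)[1] - pell_pair(n, b)[1]) % c == 0
```

For example, `pell_pair(3, 2)` is the third solution ⟨26, 15⟩ of the 2-equation $x^2 - 3y^2 = 1$, and `pell_pair(n, 1)` returns ⟨1, n⟩, matching the special values noted in 5.2.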
We use these lemmas as follows. From $E_4$ and $E_5$ we find:

$$A = a + (1 + (a^2-1)y_1^2)(1 + (a^2-1)y_1^2 - a) \equiv 1 \bmod 2y,$$

because of $E_3$. Lemma 5.4 then gives $y_2 = y_n(A) \equiv n \bmod 2y$; this is $E_8$.
Lemma 5.5 gives $y_n(A) \equiv y_n(a) \bmod x_1^2$ (because of $E_5$); this is $E_7$.


The inclusion $\mathrm{pr}\, E' \subset E$. From the relations $E_1, \ldots, E_8$ we have only to
prove that $n$ is the number of the solution $\langle x, y\rangle$. Note that, apart from the
inequality in $E_1$, $n$ only occurs in $E_8$.

For the time being we let $N$, $N_1$, and $N_2$ denote the numbers of the
solutions $\langle x, y\rangle$, $\langle x_1, y_1\rangle$, and $\langle x_2, y_2\rangle$, respectively. We shall prove that
$n \equiv N$ or $n \equiv -N \bmod 2y$.
Since we also have $y \ge n$ (by $E_1$) and $y \ge N$ (by the definition of $N$), it
follows that $n = N$, as required. The number $N_2$ will be the “stepping
stone” to get from $n$ to $N$.

First of all, as before, it follows from $E_3$, $E_4$, and $E_5$ that $A \equiv 1 \bmod 2y$, and then it
follows from the definition of $N_2$ and Lemma 5.4 that $y_2 \equiv N_2 \bmod 2y$. But
by $E_8$ we have $y_2 \equiv n \bmod 2y$; hence,

$$N_2 \equiv n \bmod 2y.$$

Next, $A \equiv a \bmod x_1^2$ by $E_5$, and then $y_2 = y_{N_2}(A) \equiv y_{N_2}(a) \bmod x_1^2$ by
Lemma 5.5. Using $E_7$, we have $y = y_N(a) \equiv y_2 \bmod x_1^2$. Hence,

$$y_N(a) \equiv y_{N_2}(a) \bmod x_1^2.$$

We now need two more lemmas, which will be proved in 5.8.
5.6. Lemma. If $y_i(a) \equiv y_j(a) \bmod x_n(a)$, where $a > 1$, then either $i \equiv j$ or
$i \equiv -j \bmod 2n$.
5.7. Lemma. If $y_i(a)^2$ divides $y_j(a)$, then $y_i(a)$ divides $j$.
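Lemmas 5.6 and 5.7 can likewise be spot-checked on small parameters. The sketch below generates the sequences with the recurrence $x_{k+1} = 2a x_k - x_{k-1}$, $y_{k+1} = 2a y_k - y_{k-1}$, which follows from the fact that $a \pm \sqrt{a^2-1}$ are the roots of $t^2 - 2at + 1$; the function names and search bounds are illustrative assumptions, and a finite check is of course no substitute for the proofs in 5.8.

```python
def pell_sequences(a, count):
    """Lists xs, ys with xs[k] = x_k(a) and ys[k] = y_k(a) for 0 <= k <= count."""
    xs, ys = [1, a], [0, 1]
    while len(xs) <= count:
        xs.append(2 * a * xs[-1] - xs[-2])
        ys.append(2 * a * ys[-1] - ys[-2])
    return xs, ys

def check_lemma_5_6(a, n, bound):
    """For 1 <= i, j <= bound: y_i = y_j mod x_n forces i = j or i = -j modulo 2n."""
    xs, ys = pell_sequences(a, bound)
    for i in range(1, bound + 1):
        for j in range(1, bound + 1):
            if (ys[i] - ys[j]) % xs[n] == 0:
                if (i - j) % (2 * n) != 0 and (i + j) % (2 * n) != 0:
                    return False
    return True

def check_lemma_5_7(a, bound):
    """For 1 <= i, j <= bound: if y_i^2 divides y_j, then y_i divides j."""
    xs, ys = pell_sequences(a, bound)
    for i in range(1, bound + 1):
        for j in range(1, bound + 1):
            if ys[j] % (ys[i] ** 2) == 0 and j % ys[i] != 0:
                return False
    return True
```

For a = 2 and n = 2 the residues of $y_k$ modulo $x_2 = 7$ run through 1, 4, 1, 0, 6, 3, 6, 0 and then repeat, which shows the period 4n and the symmetry used in the proof of Lemma 5.6.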


If we apply Lemma 5.6 with $N$, $N_2$, and $N_1$ in place of $i$, $j$, and $n$, and use
the last congruence proved, we obtain:

$$N \equiv \pm N_2 \bmod 2N_1.$$

If we apply Lemma 5.7 with $N$ and $N_1$ in place of $i$ and $j$, and use $E_3$, we
obtain $y \mid N_1$. Hence,

$$N \equiv \pm N_2 \bmod 2y,$$

and, since we have already shown that $N_2 \equiv n \bmod 2y$, this completes the
proof. □


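5.8. PROOF OF THE LEMMAS. We shall write $x_n$ and $y_n$ instead of $x_n(a)$ and
$y_n(a)$. Using the formula

$$x_{nk} + y_{nk}\sqrt{a^2-1} = \left(x_n + y_n\sqrt{a^2-1}\right)^k,$$

we find that

$$y_{nk} = \sum_{\substack{j \le k \\ j \equiv 1\ (\mathrm{mod}\ 2)}} \binom{k}{j}\, x_n^{k-j}\, y_n^{j}\, (a^2-1)^{(j-1)/2}.$$

In particular,
$$y_{nk} \equiv k\, x_n^{k-1} y_n \bmod (a^2 - 1),$$
which gives Lemma 5.4 if we set $n = 1$ (since $x_1 = a \equiv 1 \bmod (a-1)$ and $y_1 = 1$). In addition, we have
$$y_{nk} \equiv k\, x_n^{k-1} y_n \bmod y_n^3.$$
If we replace $nk$, $k$, and $n$ by $n$, $n/k$, and $k$, respectively, we obtain

$$y_n \equiv \frac{n}{k}\, x_k^{n/k - 1}\, y_k \bmod y_k^3.$$

Since $x_k$ and $y_k$ are relatively prime, we have

$$y_n \equiv 0 \bmod y_k^2 \ \Rightarrow\ \frac{n}{k} \equiv 0 \bmod y_k \ \Rightarrow\ n \equiv 0 \bmod y_k,$$

which gives Lemma 5.7.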


If we write y,(a) as a polynomial in a with integer coefficients whose 
degree and coefficients only depend on n, we immediately obtain Lemma 
5.5. It remains to prove Lemma 5.6. 

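First of all, the equation

$$x_{n \pm m} + \sqrt{a^2-1}\; y_{n \pm m} = \left(x_n + \sqrt{a^2-1}\; y_n\right)\left(x_m \pm \sqrt{a^2-1}\; y_m\right)$$

gives us

$$x_{n \pm m} = x_n x_m \pm (a^2 - 1)\, y_n y_m,$$
$$y_{n \pm m} = \pm x_n y_m + x_m y_n.$$

Hence,

$$y_{2n \pm m} = y_{n + (n \pm m)} \equiv x_{n \pm m}\, y_n \bmod x_n \equiv \pm (a^2 - 1)\, y_n^2\, y_m \bmod x_n \equiv \mp y_m \bmod x_n,$$

and, similarly,

$$y_{4n \pm m} = y_{2n + (2n \pm m)} \equiv -y_{2n \pm m} \bmod x_n \equiv \pm y_m \bmod x_n.$$

This means that the class $y_k \bmod x_n$ has period $4n$ as a function of $k$, and
within $[1, 4n]$ its behavior is determined by its values on the first quarter-period
$[1, n]$:

$$y_{2n \pm m} \equiv \mp y_m,\quad y_{4n \pm m} \equiv \pm y_m \quad \text{for } 1 \le m \le n.$$

If $a \ge 3$, it is clear that Lemma 5.6 follows from these facts and from the
inequality $y_m < \tfrac{1}{2} x_n$ for $1 \le m \le n$, which, in turn, follows because

$$4y_n^2 < (a^2 - 1)\, y_n^2 + 1 = x_n^2.$$

If $a = 2$, then we only have $y_m < \tfrac{1}{2} x_n$ for $m \le n - 1$, but this is still
enough to complete the proof of the lemma in this case. □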


6 The graph of the exponential is Diophantine

6.1. Proposition. The set
$$E:\ m = a^n$$
in the $\langle m, a, n\rangle$-space is Diophantine.


PROOF. It suffices to show that $E_0 = E \cap \{a \mid a > 1\}$ is Diophantine. If
$a > 1$, we easily obtain by induction on $n$

$$(2a - 1)^n \le y_{n+1}(a) \le (2a)^n,$$

in the notation of §5. Hence, for any $N \ge 1$ we have

$$a^n\left(1 - \frac{1}{2Na}\right)^n = \frac{(2Na - 1)^n}{(2N)^n} \le \frac{y_{n+1}(Na)}{y_{n+1}(N)} \le \frac{(2Na)^n}{(2N - 1)^n} = a^n\left(1 - \frac{1}{2N}\right)^{-n}.$$
Thus, if we choose N large enough so that both 
1 Ge 1 1] n 1 
(1 sw) I<ai and 1 (1 sia) <> 


then we obtain: $a^n = [\,y_{n+1}(Na)/y_{n+1}(N)\,]$ (where the brackets here and
below denote the integral part of a number). $E_0$ is therefore a projection of
the set $E_1$:

$$a > 1,$$

$$0 \le y_{n+1}(Na) - y_{n+1}(N)\, m < y_{n+1}(N),$$

$$N > ?,$$
where a suitable lower bound for N must be inserted in place of ?, in such 
a way as to keep the last relation Diophantine. An elementary calculation 
shows that it suffices to set N > 4n(y + 1). The results in §5 then imply 
that £, is Diophantine if we trivially introduce the auxiliary relations 


$$Y' = y_{n+1}(N) \quad \text{and} \quad Y'' = y_{n+1}(Na).$$ □
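The way a power is recovered from two values of the Pell sequence can be tried out directly. The sketch below is a numerical check rather than a Diophantine definition: it uses $a^n$ itself to choose $N$, and it takes the lower bound in the form $N > 4n(m + 1)$ with $m = a^n$, which is an assumption about how the bound quoted above should be read. The function names are illustrative.

```python
def y_pell(n, a):
    """y_n(a) for n >= 1 via y_{k+1} = 2a*y_k - y_{k-1}, with y_0 = 0 and y_1 = 1."""
    prev, cur = 0, 1
    for _ in range(n - 1):
        prev, cur = cur, 2 * a * cur - prev
    return cur

def power_via_pell(a, n):
    """Recover a**n as the integral part of y_{n+1}(N*a) / y_{n+1}(N) for a large N."""
    m = a ** n                     # used only to pick N (assumed bound N > 4n(m + 1))
    N = 4 * n * (m + 1) + 1
    return y_pell(n + 1, N * a) // y_pell(n + 1, N)

def growth_bounds_hold(a, n):
    """The estimate (2a - 1)^n <= y_{n+1}(a) <= (2a)^n from the proof."""
    return (2 * a - 1) ** n <= y_pell(n + 1, a) <= (2 * a) ** n
```

For instance, with a = 3 and n = 2 the quotient $y_3(3N)/y_3(N) = (36N^2 - 1)/(4N^2 - 1)$ exceeds 9 by less than 1 for every $N \ge 2$, so its integral part is $3^2$.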


7 The factorial and binomial coefficient graphs
are Diophantine

In this section we carry out the last series of arguments.

7.1. Proposition. The set
$$E:\ r = \binom{n}{k},\quad n \ge k$$
in the $\langle r, k, n\rangle$-space is Diophantine.


Here, by definition, $\binom{n}{k} = n(n-1)\cdots(n-k+1)/k!$. We shall need
the following


7.2. Lemma. If $u > n^k$, then $\binom{n}{k}$ = the remainder when $[(u+1)^n/u^k]$ is
divided by $u$.


PROOF. We have

$$(u+1)^n/u^k = \sum_{i=k+1}^{n} \binom{n}{i} u^{i-k} + \binom{n}{k} + \sum_{i=0}^{k-1} \binom{n}{i} u^{i-k}.$$

The first sum is divisible by $u$, and the last sum is less than 1 if $u > n^k$. □
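Lemma 7.2 translates directly into integer arithmetic. A minimal sketch, with an illustrative function name and the particular choice $u = n^k + 1$ (any $u > n^k$ would do):

```python
from math import comb

def binomial_via_remainder(n, k):
    """Binomial coefficient read off as the remainder of [(u+1)^n / u^k] modulo u."""
    u = n ** k + 1                     # any u > n^k works
    v = (u + 1) ** n // u ** k         # integral part of (u+1)^n / u^k
    return v % u
```

For n = 2, k = 1 this gives u = 3, v = [16/3] = 5, and the remainder 5 mod 3 = 2, which is the binomial coefficient of 2 over 1.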




7.3. PROOF OF PROPOSITION 7.1. We introduce the auxiliary variables $u$ and
$v$, and take the relations

$$E_1:\ u > n^k;$$

$$E_2:\ v = [(u+1)^n/u^k];$$

$$E_3:\ r \equiv v \bmod u;$$

$$E_4:\ r < u;$$

$$E_5:\ n \ge k.$$

Lemma 7.2 immediately implies that $E = \mathrm{pr} \bigcap_{i=1}^{5} E_i$. $E_1$ is Diophantine
because of Proposition 6.1; $E_3$, $E_4$, and $E_5$ are obviously Diophantine. It
also becomes obvious that $E_2$ is Diophantine if we write $E_2$ in the form

$$u^k v \le (u+1)^n < u^k v + u^k$$

and again use Proposition 6.1. This completes the proof. □
7.4. Proposition. The set $E:\ m = k!$ is Diophantine.

7.5. Lemma. If $k > 0$ and $n > (2k)^{k+1}$, then $k! = [\,n^k/\binom{n}{k}\,]$. (This is proved
by some simple estimates.)


PROOF OF PROPOSITION 7.4. We take the auxiliary variable $n$ and the
relations

$$E_1:\ n > (2k)^{k+1};$$
$$E_2:\ m = \left[\,n^k \Big/ \binom{n}{k}\right].$$

The rest is obvious (using Propositions 6.1 and 7.1). □
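Lemma 7.5 can be confirmed by a short computation. A minimal sketch, with an illustrative function name and the particular choice $n = (2k)^{k+1} + 1$:

```python
from math import comb, factorial

def factorial_via_binomial(k):
    """k! read off as the integral part of n^k / C(n, k) for one n > (2k)^(k+1)."""
    n = (2 * k) ** (k + 1) + 1         # any n > (2k)^(k+1) works
    return n ** k // comb(n, k)
```

For k = 2 this takes n = 65, and [65² / 2080] = [4225 / 2080] = 2 = 2!. The quotient $n^k/\binom{n}{k}$ equals $k!$ divided by the product of the factors $1 - i/n$ for $0 \le i < k$, which is slightly below 1, and the lower bound on $n$ keeps the excess over $k!$ below 1.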
7.6. Proposition. The set
$$E:\ \frac{x}{y} = \binom{p/q}{k},\quad p > qk,$$
in the $\langle x, y, p, q, k\rangle$-space is Diophantine.


The proof which follows is a slightly more complicated version of the
argument in 7.2 and 7.3.


7.7. Lemma. Let $a > 0$ be an integer such that $a \equiv 0 \bmod q^k k!$ and
$a > 2^{p-1} p^{k+1}$. Then

$$\binom{p/q}{k} = a^{-1}\left[a^{2k+1}(1 + a^{-2})^{p/q}\right] - a\left[a^{2k-1}(1 + a^{-2})^{p/q}\right].$$

This is proved using the binomial Taylor series for $(1 + a^{-2})^{p/q}$. The
inequality $a > 2^{p-1} p^{k+1}$ allows us to throw away all the terms in the first
sum starting with the $(k + 1)$th and all the terms in the second sum starting
with the $k$th when we take the integral part. The congruence $a \equiv 0 \bmod q^k k!$
ensures that the partial sums are integers. □


7.8. PROOF OF PROPOSITION 7.6. We use the auxiliary variables $a$, $u_1$, $u_2$,
and $v$, and the following relations:

$$E_1:\ a \equiv 0 \bmod q^k k!;$$

$$E_2:\ a > 2^{p-1} p^{k+1};$$

$$E_3:\ u_1/u_2 = a^{-1}\left[a^{2k+1}(1 + a^{-2})^{p/q}\right];$$

$$E_4:\ v = a\left[a^{2k-1}(1 + a^{-2})^{p/q}\right];$$

$$E_5:\ x u_2 = y(u_1 - v u_2).$$

It follows from Lemma 7.7 that $E = \mathrm{pr} \bigcap_{i=1}^{5} E_i$. $E_1$ and $E_2$ are immediately
seen to be Diophantine from Propositions 6.1 and 7.4. $E_3$ and $E_4$ are
shown to be Diophantine just as at the end of 7.3, except that this time we
must raise the inequalities to the $q$th power after clearing denominators. $E_5$
is obviously Diophantine.

This concludes the proof of Theorem 1.2, that enumerable sets coincide
with Diophantine sets. □


8 Versal families 


Versal families were defined and first used in subsection 5.7 of Chapter V. 
The purpose of this section is to prove their existence, using the result that 
enumerable sets are Diophantine (Theorem 1.2). 


8.1. Theorem. For any $m \ge 0$, versal enumerable families of m-sets and
m-functions over the base $\mathbf{Z}^+$ exist and can be effectively constructed.


PROOF. We divide the proof into several steps. Recall that $\tau^{(2)} : (\mathbf{Z}^+)^2 \to \mathbf{Z}^+$
is the primitive recursive isomorphism constructed in §4 of Chapter V, and
$\langle t_1^{(2)}, t_2^{(2)}\rangle$ is its inverse. We shall write $t_1$ and $t_2$ for brevity.

(a) A versal family of polynomials in $\mathbf{Z}^+[x_1, x_2, x_3, \ldots]$. We define
polynomials $f[l] \in \mathbf{Z}^+[x_1, x_2, x_3, \ldots]$ by recursion on $l \in \mathbf{Z}^+$, $l \ge 4$:

$$f[1] = f[2] = f[3] = 1;$$
$$f[4k] = k;$$
$$f[4k+1] = x_k;$$
$$f[4k+2] = f[t_1(k)] + f[t_2(k)];$$
$$f[4k+3] = f[t_1(k)]\, f[t_2(k)].$$

The definition is correct, since $t_1(k), t_2(k) < 4k + 2$. The image of the map
$k \mapsto f[k]$ coincides with all of $\mathbf{Z}^+[x_1, x_2, x_3, \ldots]$, since it contains $\mathbf{Z}^+$ (in
the $4k$-places) and all the $x_k$ (in the $(4k+1)$-places), and, whenever it
contains two polynomials $f[k_1]$ and $f[k_2]$, it contains their sum (in the
$(4\tau(k_1, k_2) + 2)$-place) and their product (in the $(4\tau(k_1, k_2) + 3)$-place).
(Compare with the numbering of constructible sets by ordinals in Chapter
IV.)
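The numbering $l \mapsto f[l]$ can be made concrete once a pairing function is fixed. The sketch below uses the Cantor pairing on positive integers as a stand-in for $\tau^{(2)}$; this is an assumption, since the specific isomorphism of Chapter V is not reproduced here, and all function names are illustrative. Polynomials are shown as expression strings.

```python
def tau(a, b):
    """Cantor-style pairing of positive integers (a stand-in for tau^(2))."""
    s = a + b - 2
    return s * (s + 1) // 2 + a

def tau_inv(k):
    """Inverse of tau: the pair (t1(k), t2(k)); both components are <= k."""
    s = 0
    while (s + 1) * (s + 2) // 2 < k:
        s += 1
    a = k - s * (s + 1) // 2
    return a, s + 2 - a

def poly(l):
    """Expression string for the polynomial f[l] of step (a)."""
    if l <= 3:
        return "1"
    k, r = divmod(l, 4)
    if r == 0:
        return str(k)                  # f[4k] = k
    if r == 1:
        return f"x{k}"                 # f[4k+1] = x_k
    t1, t2 = tau_inv(k)
    op = "+" if r == 2 else "*"        # f[4k+2] is a sum, f[4k+3] a product
    return f"({poly(t1)}{op}{poly(t2)})"

def index_of_sum(l1, l2):
    """A place where f[l1] + f[l2] occurs."""
    return 4 * tau(l1, l2) + 2

def index_of_product(l1, l2):
    """A place where f[l1] * f[l2] occurs."""
    return 4 * tau(l1, l2) + 3
```

With this pairing, $x_1$ sits in place 5 and the constant 2 in place 8, so their sum $x_1 + 2$ sits in place $4\tau(5, 8) + 2 = 286$. The recursion in `poly` terminates because both components of `tau_inv(k)` are at most k, which is smaller than 4k + 2.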

(b) Construction of a versal 1-family over $\mathbf{Z}^+$. Let $E_k$ be the projection
onto the $x_1$-coordinate of the 0-level of the polynomial
$f[t_1(k)] - f[t_2(k)]$. Since all the elements of $\mathbf{Z}[x_1, x_2, x_3, \ldots]$ can be
represented as such a difference, it is clear that the family $\{E_k\}$ contains
all enumerable sets.

(c) $\{E_k\}$ is enumerable. We must show that the total space
$E = \{\langle i, j\rangle \mid i \in E_j\} \subset \mathbf{Z}^+ \times \mathbf{Z}^+$ is enumerable. We write the condition $i \in E_j$ in the form
of a $\Sigma_1$ type formula, in which all the quantified variables take values in
$\mathbf{Z}^+$. We use the fact that $f[t_1(j)] - f[t_2(j)] \in \mathbf{Z}[x_1, \ldots, x_j]$. We have:

$$\langle i, j\rangle \in E \iff i \in E_j \iff \exists x_1 \ldots \exists x_j\, (x_1 = i \wedge f[t_1(j)] = f[t_2(j)])$$
$$\iff \exists t\, \bigl( (\exists x_1 \ldots \exists x_j\ \forall k \le j\ (f[k] = \mathrm{Gd}(k, t))) \wedge \mathrm{Gd}(5, t) = i \wedge \mathrm{Gd}(t_1(j), t) = \mathrm{Gd}(t_2(j), t) \bigr),$$

where $\mathrm{Gd}(k, t)$ is Gödel’s function (see §4 of Chapter V). Furthermore, by
the definition of $f[k]$:

$$\exists x_1 \ldots \exists x_j\ \forall k \le j\ (f[k] = \mathrm{Gd}(k, t))$$
$$\iff \forall k \le j\ \bigl( (k \le 3 \wedge \mathrm{Gd}(k, t) = 1) \vee \exists l\, ( (k = 4l \wedge \mathrm{Gd}(k, t) = l) \vee (k = 4l + 1)$$
$$\vee\ (k = 4l + 2 \wedge \mathrm{Gd}(k, t) = \mathrm{Gd}(t_1(l), t) + \mathrm{Gd}(t_2(l), t))$$
$$\vee\ (k = 4l + 3 \wedge \mathrm{Gd}(k, t) = \mathrm{Gd}(t_1(l), t)\, \mathrm{Gd}(t_2(l), t)) ) \bigr).$$

Here the part of the formula after $\exists l$ defines a decidable set in $\langle k, t, l\rangle$-space.
The quantifier $\exists l$ projects this set onto the $\langle k, t\rangle$-coordinates,
thereby taking it to an enumerable set, and the bounded quantifier $\forall k \le j$
preserves enumerability (see §2). Returning to the formula which defines
$E$, we find that the set we have constructed so far must be intersected with
two other decidable sets and then projected along the $t$-axis, so that the
result is again enumerable.

(d) Construction of a versal m-family over $\mathbf{Z}^+$. The case $m = 0$ is trivial,
and the case $m = 1$ has already been discussed. The case $m \ge 2$ reduces to
the case $m = 1$ using the isomorphism $\tau^{(m)} : (\mathbf{Z}^+)^m \to \mathbf{Z}^+$. In fact, let
$E_k = E_k^{(1)}$ be a versal 1-family, and set $E_k^{(m)} = (\tau^{(m)})^{-1}(E_k^{(1)})$. The family
$\{E_k^{(m)}\}$ is enumerable because

$$E^{(m)} = \{\langle x, k\rangle \mid x \in E_k^{(m)}\} = \{\langle (\tau^{(m)})^{-1}(x), k\rangle \mid x \in E_k^{(1)}\} = (\tau^{(m)} \times \mathrm{id})^{-1}(E^{(1)}).$$

(e) Construction of a versal family of 1-functions. We take a versal
2-family $\{E_k^{(2)}\}$ with total space

$$E^{(2)} = \{\langle x, y, k\rangle \mid \langle x, y\rangle \in E_k^{(2)}\} \subset (\mathbf{Z}^+)^3.$$

Let $g(x, y, k, z)$ be a primitive recursive function such that the projection
of its 1-level onto the $\langle x, y, k\rangle$-coordinates coincides with $E^{(2)}$. We set

$$f(x, k) = t_1^{(2)}\bigl(\min\{u \mid g(x, t_1(u), k, t_2(u)) = 1\}\bigr).$$

We claim that $\{f_k \mid f_k(x) = f(x, k)\}$ is a versal family of 1-functions. The
total function is obviously partial recursive. We need only verify that every
partial recursive 1-function $f$ occurs in the family.

Let $\Gamma_f$ be the graph of $f$, and let $\Gamma_f = E_{k_0}^{(2)}$, where $k_0 \in \mathbf{Z}^+$. We show that
$f = f_{k_0}$. In fact,

$$\langle x, f(x)\rangle \in \Gamma_f = E_{k_0}^{(2)} \iff \langle x, f(x), k_0\rangle \in E^{(2)} \iff \exists z \in \mathbf{Z}^+,\ g(x, f(x), k_0, z) = 1.$$

Among the $z \in \mathbf{Z}^+$ which make $g(x, f(x), k_0, z) = 1$, we choose the $z$ for
which the number $u$ given by $\langle f(x), z\rangle = \langle t_1(u), t_2(u)\rangle$ is minimal. For
this $u$ we have $f_{k_0}(x) = t_1^{(2)}(u) = f(x)$, which proves the claim.
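The definition of $f(x, k)$ in step (e) is an unbounded search over pairs ⟨candidate value, witness⟩. The sketch below imitates it with a toy checker; the Cantor pairing again stands in for $\tau^{(2)}$, and the checker `toy_g`, the step limit `max_steps`, and the function names are all assumptions made for the illustration.

```python
def tau_inv(u):
    """Inverse of a Cantor-style pairing on positive integers: u -> (t1(u), t2(u))."""
    s = 0
    while (s + 1) * (s + 2) // 2 < u:
        s += 1
    a = u - s * (s + 1) // 2
    return a, s + 2 - a

def f_from_graph(x, k, g, max_steps=None):
    """t1 of the least u with g(x, t1(u), k, t2(u)) == 1.

    The true search is unbounded and simply does not halt where f_k is
    undefined; max_steps is a practical cut-off added for this sketch.
    """
    u = 1
    while max_steps is None or u <= max_steps:
        y, z = tau_inv(u)
        if g(x, y, k, z) == 1:
            return y
        u += 1
    return None

def toy_g(x, y, k, z):
    """Toy checker: <x, y> is in the k-th graph iff x is even and y == k*x; the witness is z == x/2."""
    return 1 if (x % 2 == 0 and y == k * x and z == x // 2) else 0
```

Here `f_from_graph(4, 3, toy_g)` returns 12, found at the pair ⟨12, 2⟩, while for odd x no pair is ever accepted and the search runs until the cut-off. The value returned is well defined only because each section of the toy set is the graph of a function, which is the situation considered in the claim of step (e).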

(f) Construction of a versal family of m-functions. The case $m = 0$ is
trivial. If $\{f_k^{(1)}\}$ is a versal family of 1-functions, then for $m \ge 2$ we set

$$f_k^{(m)}(x_1, \ldots, x_m) = f_k^{(1)}(\tau^{(m)}(x_1, \ldots, x_m)),$$

thereby obtaining a versal family of m-functions.
The theorem is proved. □


8.2. The choice of versal families is far from unique. If m > 1, there does 
not exist a versal family which contains each function or each set exactly 
once (i.e., a universal family). Nevertheless, there are important methods of 
extracting invariant information from data about the position of a function 
or set in a versal family. The next section is devoted to this question. 


9 Kolmogorov complexity 


9.1. Let $u = \{u_k\}$ be an enumerable family of m-functions over $\mathbf{Z}^+$, and let
$f$ be a partial recursive m-function. We define the complexity of $f$ relative to
the family $u$ as

$$C_u(f) = \begin{cases} \min\{k \mid u_k = f\}, & \text{if such a } k \text{ exists;} \\ \infty, & \text{otherwise.} \end{cases}$$

We call the enumerable family $u$ (asymptotically) optimal if, for any
other enumerable family $v$, there exists a constant $c_{u,v} > 0$ such that for
every partial recursive m-function $f$ we have

$$C_u(f) \le c_{u,v}\, C_v(f).$$

If we take $v$ to be any versal family, we see that an optimal family must be
versal, i.e., $C_u(f)$ never takes the value $\infty$.


9.2. Theorem (Kolmogorov)


(a) For any $m \ge 0$, optimal families exist and can be effectively constructed.
(b) If $u$ and $v$ are optimal families of m-functions, then for any m-function $f$

$$c_{v,u}^{-1} \le C_u(f)/C_v(f) \le c_{u,v}.$$


9.3. Remarks 

(a) The measure of complexity $C_u(f)$ involves the following intuitive
ideas. In order to define any enumerable family $u$, it is only necessary to
give a finite amount of information, for example, a program which
semicomputes the total function of $u$. Therefore, in order to define a specific
function $f$ which occurs in the family $u$, it suffices to give no more than

$$\log_2 C_u(f) + \mathrm{const}$$

bits of information, namely, the program for $u$ and the number of $f$ in $u$.

(b) A family being optimal means that it can be used to compute any 
m-function, and that the loss in using it rather than any other family to 
compute a function is bounded by a constant which does not depend on 
the function. 

(c) Finally, the inequality 9.2(b), which follows trivially from the defini- 
tion of an optimal family, shows that, to within an additive term which is 
bounded in absolute value, the logarithmic measure of complexity 


$$K_u(f) = [\log_2 C_u(f)] + 1 \quad \text{(where “[ ]” = “integral part”)}$$


does not depend on the choice of the optimal family u, and so is an 
asymptotic invariant of f. 


9.4. PROOF OF THEOREM 9.2. We first choose a recursive imbedding
$\theta : \mathbf{Z}^+ \times \mathbf{Z}^+ \to \mathbf{Z}^+$ which has a recursive inverse function and which
satisfies the following linear growth condition in one of its arguments:

$$\theta(k, j) \le k \cdot \varphi(j), \quad \text{for all } k, j \in \mathbf{Z}^+ \text{ and some suitable } \varphi : \mathbf{Z}^+ \to \mathbf{Z}^+.$$

For example, we could let $\theta_1(k, j) = (2k - 1)2^j$ with $\varphi_1(j) = 2^{j+1}$, or,
following Kolmogorov, we could let

$$\theta_2(\overline{k_1 k_2 \cdots k_r},\ \overline{j_1 j_2 \cdots j_s}) = \overline{j_1 j_1 j_2 j_2 \cdots j_s j_s\, 0\, 1\, k_1 k_2 \cdots k_r},$$

where $k_\alpha, j_\beta \in \{0, 1\}$ and the bar denotes the binary expansion of a
number. Here $\varphi_2(j) \le \mathrm{const} \cdot j^2$, so that this function grows more slowly.
(See also subsection 9.8 below.)
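The first example of an imbedding is simple enough to write out together with its inverse. A minimal sketch with illustrative function names:

```python
def theta1(k, j):
    """The imbedding theta_1(k, j) = (2k - 1) * 2^j for k, j >= 1."""
    return (2 * k - 1) * 2 ** j

def theta1_inv(n):
    """Return (k, j) with theta1(k, j) == n, or None if n is odd (not in the image)."""
    if n % 2 == 1:
        return None
    j = 0
    while n % 2 == 0:
        n //= 2
        j += 1
    return (n + 1) // 2, j

def phi1(j):
    """Growth function: theta1(k, j) <= k * phi1(j)."""
    return 2 ** (j + 1)
```

Since j ≥ 1, the image of `theta1` consists of the even numbers only, so it is an imbedding and not a bijection. The inverse recovers j as the exponent of 2 and k from the remaining odd factor. The bound $\theta_1(k, j) \le k \cdot 2^{j+1}$ is linear in k for each fixed j, which is the property used in the proof.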

Now let $U$ be any versal family of $(m + 1)$-functions. We define a
family $u$ of m-functions by setting

$$u(x_1, \ldots, x_m;\, k) = U(x_1, \ldots, x_m, \theta^{-1}(k)).$$

We show that the family $u$ is optimal, with the following bound for the
constants

$$c_{u,v} \le \varphi(C_U(v)).$$

In fact, let $f$ be a recursive m-function. It suffices to consider the case when
$f$ occurs in the family $v$. Then

$$f(x_1, \ldots, x_m) = v(x_1, \ldots, x_m;\, C_v(f))$$
$$= U(x_1, \ldots, x_m, C_v(f);\, C_U(v))$$
$$= u(x_1, \ldots, x_m;\, \theta(C_v(f), C_U(v))),$$

so that

$$C_u(f) \le \theta(C_v(f), C_U(v)) \le C_v(f) \cdot \varphi(C_U(v)).$$

The theorem is proved. □


9.5. EXAMPLE. A 0-function $f$ can be identified with the single value it
takes, i.e., with a positive integer $n$. In this case, Theorem 9.2 gives us an
almost invariant complexity $C_u(n)$ for the integers. We have:

(a) $C_u(n) \le \mathrm{const} \cdot n$ for all $n$, since the function “$n$” appears in the $n$th
place in the simplest versal family $u_n(\cdot) = n$.

(b) $C(n) \sim \min\{2^{j-1}(2k - 1) \mid n$ is the $k$th value of the $j$th function in
some versal family of 1-functions$\}$. (We write $f \sim g$ if $f$ and $g$ have the
same domain of definition, and $f \le \mathrm{const} \cdot g$ and $g \le \mathrm{const} \cdot f$ for suitable
constants. In relations of the type $C_u(f_x) \sim g(x)$, we often omit the
designation of the optimal family $u$, which we take to be arbitrary, but fixed.)

It is clear from (b) that the complexity of the numbers $p_n$ (the $n$th
prime), $n^n$, or

$$n^{n^{\cdot^{\cdot^{\cdot^{n}}}}} \quad (n \text{ times})$$

as $n \to \infty$ is asymptotically no greater than $\mathrm{const} \cdot n$, since each of these is
the $n$th value of a fixed recursive function. In 9.7(b) below, we shall lower
this estimate to $\mathrm{const} \cdot C(n)$.

Instead of integers, Kolmogorov and his collaborators considered finite 
binary sequences and constructed a theory which showed that the most 
complex binary sequences are those which approach random behavior. See 
the survey article by A. K. Zvonkin and L. A. Levin in Uspehi Matem. 
Nauk, vol. XXV, No. 6 (1970) (translated in Russian Mathematical 
Surveys), which contains a large bibliography.

9.6. Proposition.
(a) Let

$$F(x_1, \ldots, x_m) = f_0(f_1(x_1, \ldots, x_m), \ldots, f_n(x_1, \ldots, x_m)),$$

where the $f_i$ are recursive functions. Then

$$C(F) \le \mathrm{const} \cdot \prod_{i=1}^{n} C(f_i) \cdot \log^{n-1}\Bigl(\prod_{i=1}^{n} C(f_i)\Bigr)$$

if $f_0$ is fixed and the $f_i$, $i \ge 1$, run through all possible m-functions. Here const
depends on $f_0$ and on the families used to compute the complexity, but does
not depend on $f_1, \ldots, f_n$.
(b) If $f_0$ is also allowed to vary, then $\prod_{i=1}^{n}$ must be replaced by $\prod_{i=0}^{n}$ and
$\log^{n-1}$ must be replaced by $\log^{n}$ on the right.


9.7. Special cases
(a) If, for example, we set $f_0 = \mathrm{sum}_2$ or $\mathrm{prod}_2$, then we have:

$$C(f_1 + f_2),\ C(f_1 f_2) \le \mathrm{const}\ C(f_1)\, C(f_2) \log(C(f_1)\, C(f_2)).$$

(b) If we set $n = 1$ and $m = 0$, we find that for any enumerable family
$\{f_k\}$:

$$C(f(k, x_1, \ldots, x_m)) \le \mathrm{const}\ C(k).$$

9.8. PROOF OF PROPOSITION 9.6. First of all, for every $n \ge 1$ we define the
following recursive bijection with a recursive inverse:

$\theta^{(n)}(k_1, \ldots, k_n)$ = the index of the n-tuple $\langle k_1, \ldots, k_n\rangle$
if we order n-tuples according to increasing $\prod_{i=1}^{n} k_i$,
and in alphabetical order for fixed $\prod_{i=1}^{n} k_i$.

It is easy to see (by induction on $n$) that

$$\theta^{(n)}(k_1, \ldots, k_n) \le \mathrm{const} \prod_{i=1}^{n} k_i \cdot \log^{n-1}\Bigl(\prod_{i=1}^{n} k_i\Bigr).$$

We define the function $\Theta : (\mathbf{Z}^+)^{n+1} \to \mathbf{Z}^+$ as follows:

$$\Theta(k_1, \ldots, k_{n+1}) = \theta(\theta^{(n)}(k_1, \ldots, k_n), k_{n+1}),$$

where $\theta$ is as described in 9.4.

We now consider two optimal families $v(x_1, \ldots, x_n;\, l)$ and
$u(x_1, \ldots, x_m;\, k)$ of n-functions and m-functions, respectively. We use
these two families to construct the families

$$W(x_1, \ldots, x_m;\, k_1, \ldots, k_n, l) = v(u(x_1, \ldots, x_m;\, k_1), \ldots, u(x_1, \ldots, x_m;\, k_n);\, l),$$
$$w(x_1, \ldots, x_m;\, k) = W(x_1, \ldots, x_m;\, \Theta^{-1}(k)).$$

The function $F$ occurs in the

$$\Theta(C_u(f_1), \ldots, C_u(f_n), C_v(f_0))$$

place in the family $w$. Then the estimate $\theta(k, j) \le k \cdot \varphi(j)$, along with the
estimate for $\theta^{(n)}$, give assertion (a).
We similarly obtain (b), if we replace $\Theta$ by $\theta^{(n+1)}$ in the definition of $w$. □


Remark. The function $\theta^{(n)}$ gives us the most economical estimate for
$C(F)$ which is symmetrical in the $C(f_1), \ldots, C(f_n)$. In specific situations
it might make sense to improve the estimate in certain of the $C(f_i)$ at the
expense of worsening the estimate with respect to the others; this is done
by suitably changing $\theta^{(n)}$. For example, Kolmogorov’s $\theta_2$ gives

$$C(f_1 + f_2) \le \mathrm{const}\ C(f_1)\, C(f_2)^2,$$

which is better than

$$\mathrm{const}\ C(f_1)\, C(f_2) \log(C(f_1)\, C(f_2))$$

if $C(f_2)$ grows much more slowly than $C(f_1)$.


9.9. Theorem. The function $C(f)$ is not computable. More precisely, let $g(k)$
be any unbounded partial recursive function, and let $\{f_k\}$ be any enumerable
family. Then it is false that $C(f_k)|_{D(g)} \sim g(k)$.


Thus, $C(f_k)$ can only be computable (even up to $\sim$) on a set of indices
$k$ such that there are only finitely many different functions among the
functions $f_k$; otherwise, $C(f_k)$ is not bounded on this set.


PROOF. Suppose that $C(f_k)|_{D(g)} \sim g(k)$. We show that there exists a
general recursive function $h : \mathbf{Z}^+ \to \mathbf{Z}^+$ whose image is contained in $D(g)$
and such that $g \circ h$ is monotonically increasing. We then obtain a
contradiction as follows. By 9.7(b), for all $k$ we have

$$C(f_{h(k)}) \le \mathrm{const}\ C(k),$$

and, by our assumption and by the fact that $g \circ h$ is increasing,

$$C(f_{h(k)}) \ge \mathrm{const}\ g(h(k)) \ge \mathrm{const} \cdot k.$$

But these two inequalities are incompatible, because $\liminf C(k)/k = 0$
(for example, $C(k^2)/k^2 \le \mathrm{const}/k$).

It remains to construct $h$. We choose a general recursive bijection
$h_1 : \mathbf{Z}^+ \to D(g)$, using Proposition 5.6 of Chapter V, and we set

$$E = \{k \mid \forall i < k,\ g(h_1(i)) < g(h_1(k))\}.$$

This set is decidable and infinite, and $g \circ h_1$ is an increasing function on $E$.

Let $h_2 : \mathbf{Z}^+ \to E$ be an increasing general recursive bijection (again using
Proposition 5.6 of Chapter V). Then $h = h_1 \circ h_2$ has the necessary properties.
The theorem is proved. □


9.10. Remarks 

(a) Theorem 9.9 shows that computing complexity is a problem demand- 
ing creativity: even if we find the number of a place where f occurs in an 
optimal family {u,}, there is no algorithm which could tell us whether or 
not this function occurs even sooner. 

(b) Since $C(k) \ne C(l) \iff k \ne l$, it follows that for all $x$ and $B$

$$\mathrm{card}\{y \mid y \le x,\ C(y) \le x/B\} \le x/B,$$

i.e., most numbers have a large complexity.

Nevertheless, it is not possible to give effectively a sequence of numbers
which asymptotically have maximal complexity. More precisely, let $\{k_i\}$ be
any increasing sequence with $C(k_i) \ge k_i/B$ for some constant $B$. Then the
set $\{k_i\}$ does not contain a single infinite enumerable set $E$. Otherwise, we
would be able to find an increasing general recursive function $h : \mathbf{Z}^+ \to E$,
and would obtain a contradiction, as in Theorem 9.9.

(c) Let $u = \{u_k\}$ be any optimal family of m-functions. The “moments
of first appearance” $\{k \mid \forall i < k,\ u_i \ne u_k\}$ actually form a sequence of
asymptotically maximal complexity, since, by the definition and by 9.7(b),
they satisfy

$$k = C_u(u_k) \le \mathrm{const} \cdot C(k).$$


Thus, we might say that in an optimal family the functions first appear “at 
random moments.” 

The problem of computing $C_u(u_k)$ is complicated by the fact that, at
least in the specific families in the proof of Theorem 9.2, any function 
appears infinitely often, so that if we are not lucky we might first notice 
the function arbitrarily far out from the place where it first appeared. 

(d) Finally, we mention that at least one essential aspect of the complexity
of computations has not been touched upon in our discussion of $C_u$.
Namely, $\log_2 C(k)$ measures the length of a program that could compute $k$,
but says nothing about the time it takes for such a program to work, let 
alone the possibilities for shortening the time by performing parallel 
computations, lengthening the program, and so on. 

The concept of complexity is rather far removed from practical uses. 
But it seems to be such a fundamental idea that its role in theoretical 
mathematics is likely to grow. 
II
PROVABILITY AND COMPUTABILITY 


CHAPTER VII 


Gödel’s incompleteness theorem


1 Arithmetic of syntax 


1.1. In this section we show how the syntax of formal languages reduces in 
principle to arithmetic. We do this by identifying the symbols, expressions, 
and texts in a finite or countable alphabet A with certain natural numbers 
(i.e., by numbering them) in such a way that the syntactic operations 
(juxtaposition, substitution, etc.) are represented by recursive functions, 
and the syntactic relations (occurrence in an expression, “being a for- 
mula,” etc.) are represented by decidable or enumerable sets. 

In Chapter II we described how this technique works for Smullyan’s 
language of arithmetic, but now we shall investigate it more systematically. 
Our first task is to show that the computability of syntactic operations and 
the decidability (enumerability) of syntactic relations on the sets of expres- 
sions and texts do not depend on how we number them, as long as we 
adhere to certain weak natural restrictions. 

This independence of the method of numbering allows us to consider 
this numbering not only as a technical device, but also as a reflection of a 
deep equivalence between arithmetic and the combinatorial properties of 
formal texts. In modern computers, where a single store-location may serve
consecutively as a number, a name (code), and a command, this equivalence
between syntax and arithmetic is realized “in the flesh” and is
accepted as a basic principle. This was not the case, however, in 1931,
when Gödel first introduced the concept of numbering.


1.2. Numbering. Let $S$ be a finite or countable set. By a numbering of $S$ we
mean any injective map $N : S \to Z^+$ whose image is decidable. We call $N(s)$
the $N$-number of an element $s \in S$. We call two numberings $N$ and $M$ of a
set $S$ equivalent if the partial functions $N \circ M^{-1}$ and $M \circ N^{-1}$ from $Z^+$ to
$Z^+$ are partial recursive. These functions are automatically computable (not
only semi-computable), since their domains of definition are decidable (see
§§1–2 of Chapter V).

The intuitive meaning of these definitions is clear: requiring the set of 
N(s) to be decidable ensures that it is possible to determine whether or not 
a natural number has the property of “being the number of an element of
$S$,” and two numberings are equivalent when each of them can be
effectively recovered from the other for any $s \in S$.


1.3. Lemma.
(a) The relation of equivalence between numberings is reflexive, symmetric,
and transitive.
(b) Any injective map from a finite set $S$ to $Z^+$ is a numbering, and any
two numberings of a finite $S$ are equivalent.
(c) Any numbering of an infinite set is equivalent to a numbering whose
image is all of $Z^+$.


All this either is obvious or has already been proved. In particular, (c) 
follows from Proposition 5.2 in Chapter V. 


1.4. Let $S_1$ and $S_2$ be two sets, and let $N_i : S_i \to Z^+$, $i = 1, 2$, be numberings
of them. We call a partial function $f : S_1 \to S_2$ partial recursive relative to
$\langle N_1, N_2 \rangle$ if the map $N_2 \circ f \circ N_1^{-1}$ is partial recursive. A tautological
example: any numbering function $N : S \to Z^+$ is partial recursive relative to
$\langle N, \mathrm{identity} \rangle$.

A subset $T \subset S_1$ is said to be decidable, enumerable, arithmetical (i.e.,
definable in $L_1$Ar, see Chapter II, §2) relative to the numbering $N_1$, if the
set $N_1(T)$ has the corresponding property.


1.5. Lemma. If $\langle N_1, N_2 \rangle$ is replaced by a pair of equivalent numberings
$\langle N_1', N_2' \rangle$ in 1.4, then the classes of recursive functions $f : S_1 \to S_2$ and of
decidable, enumerable, and arithmetical subsets of $S_1$ do not change.


Proof. The composition of computable recursive functions is recursive
and computable. The inverse image of a decidable (respectively enumerable)
set with respect to a computable function is decidable (respectively
enumerable). Finally, suppose that $f : Z^+ \to Z^+$ is a partial recursive
function, and that $E \subset Z^+$ is an arithmetical set. Then
$f^{-1}(E) = \mathrm{pr}_1((Z^+ \times E) \cap \Gamma_f)$ (in $Z^+ \times Z^+$). Since $Z^+ \times E$ is arithmetical and $\Gamma_f$ is
also arithmetical (even Diophantine), it follows that $f^{-1}(E)$ is arithmetical. □


1.6. Let $S_i$ be sets with numberings $N_i$, $i = 1, \ldots, r$. A numbering
$N : S_1 \times \cdots \times S_r \to Z^+$ is said to be compatible with $\langle N_1, \ldots, N_r \rangle$ if the
projection $\mathrm{pr}_i : S_1 \times \cdots \times S_r \to S_i$ is recursive relative to $\langle N, N_i \rangle$ for all
$i = 1, \ldots, r$, and if the partial function

$$(Z^+)^r \xrightarrow{(N_1^{-1}, \ldots, N_r^{-1})} S_1 \times \cdots \times S_r \xrightarrow{\ N\ } Z^+$$

is recursive. In other words, the $N_i$-numbers of the coordinates are
computed from the $N$-number of the vector, and conversely.


1.7. Lemma.
(a) In the notation of 1.5, for any $\langle N_1, \ldots, N_r \rangle$ there exists a
numbering $N$ which is compatible with them. For example, for $s_i \in S_i$,
$i = 1, \ldots, r$, we may set

$$N(s_1, \ldots, s_r) = \tau(N_1(s_1), \ldots, N_r(s_r))$$

(for the definition of $\tau$, see subsection 4.5 in Chapter V).

(b) If $N$ is compatible with $\langle N_1, \ldots, N_r \rangle$, $N$ is equivalent to $M$, and $N_i$
is equivalent to $M_i$ for $i = 1, \ldots, r$, then $M$ is compatible with
$\langle M_1, \ldots, M_r \rangle$.

(c) If $N$ is compatible with $\langle N_1, \ldots, N_r \rangle$ and $M$ is compatible with
$\langle N_1, \ldots, N_r \rangle$, then $N$ and $M$ are equivalent. If $N$ is compatible with
$\langle N_1, \ldots, N_r \rangle$ and also with $\langle M_1, \ldots, M_r \rangle$, then $N_i$ and $M_i$ are
equivalent for all $i = 1, \ldots, r$.


What all this says is that the relationship of compatibility gives a
one-to-one correspondence between families consisting of $r$ equivalence
classes of numberings of the sets $S_1, \ldots, S_r$ and certain equivalence
classes of numberings of $S_1 \times \cdots \times S_r$. This lemma is proved by
mechanically checking the definitions.


1.8. Let $A^l = A \times \cdots \times A$ ($l$ times), and let
$S(A) = A^0 \cup A^1 \cup \cdots \cup A^l \cup \cdots$. If $A$ is an alphabet, then $S(A)$ is the set of
expressions in the alphabet. Here $A^0 = \{\Lambda\}$ consists of the empty expression.
The function $S(A) \to Z^+$ which takes the value $p$ on each element of $A^p$ is called the
length of the expression. The “$i$th coordinate” partial function from
$Z^+ \times S(A)$ to $A^1$ given by $\langle i, \langle a_1, \ldots, a_p \rangle \rangle \mapsto a_i$ is defined on the subset
$\bigcup_{i \ge 1} \{i\} \times (A^i \cup A^{i+1} \cup \cdots)$. The “juxtaposition” function from
$S(A) \times S(A)$ to $S(A)$ takes

$$\langle \langle a_1, \ldots, a_p \rangle, \langle b_1, \ldots, b_q \rangle \rangle \mapsto \langle a_1, \ldots, a_p, b_1, \ldots, b_q \rangle.$$

A numbering $N$ of $S(A)$ is called admissible if the length function, the
$i$th coordinate function, and the juxtaposition function are partial recursive
relative to $\langle N, \mathrm{id} \rangle$, $\langle \langle \mathrm{id}, N \rangle, N \rangle$, and $\langle \langle N, N \rangle, N \rangle$, respectively. A
numbering $N$ of $S(A)$ is said to be compatible with a numbering $N_0$ of $A$ if it is
admissible and if the restriction of $N$ to $A^1$ is equivalent to $N_0$ on $A$ (where
we identify $A^1$ with $A$).

Here is the basic result of this section:


1.9. Proposition. 


(a) If $N$ is admissible, then any numbering equivalent to $N$ is also
admissible.

(b) If $N$ is compatible with $N_0$, $N'$ is equivalent to $N$, and $N_0'$ is equivalent
to $N_0$, then $N'$ is compatible with $N_0'$.

(c) If $N$ and $N'$ are both compatible with $N_0$, then they are equivalent.

(d) For any numbering $N_0$ of $A$, there exists a compatible numbering $N$ of
$S(A)$, whose equivalence class is uniquely determined by the class of
$N_0$ because of (c).


Proof. We obtain (a) and (b) formally from Lemma 1.6. To prove (c), we
find the $N$-number of an expression from its $N'$-number as follows. Let
$m * n = N(N^{-1}(m) N^{-1}(n))$ (where the argument of $N$ is the juxtaposition
of the two expressions $N^{-1}(m)$ and $N^{-1}(n)$). The partial function
from $Z^+ \times Z^+$ to $Z^+$ defined by $\langle m, n \rangle \mapsto m * n$ is recursive and associative,
since $N$ is admissible. Further, let $(k)_i = N(\text{the } i\text{th coordinate of } N^{-1}(k))$.
The partial function $Z^+ \times Z^+ \to Z^+ : \langle k, i \rangle \mapsto (k)_i$ is recursive
for the same reason. We similarly define $(k)'_i$ in terms of $N'$. Finally, let
$l' : Z^+ \to Z^+$ be the partial function “the length of $(N')^{-1}(k)$.” It is also
recursive.
Then we have:

$$N \circ (N')^{-1}(k) = N \circ (N')^{-1}((k)'_1) * \cdots * N \circ (N')^{-1}((k)'_{l'(k)}).$$

But the $N'$-numbers of the one-letter expressions $\{(k)'_i\}$ form a decidable
subset of $Z^+$ (namely, the 1-level of the computable function $l'$). The
restriction of $N \circ (N')^{-1}$ to this subset is a recursive function, since the
restrictions of $N$ and $N'$ to $A^1$ are equivalent. We obtain (c) from this and
from the recursiveness of $*$, $(k)'_i$, and $l'$ (by applying induction on $x$ to
$*_{i=1}^{x} N \circ (N')^{-1}((k)'_i)$ and then substituting $x = l'(k)$).

We prove (d) using an explicit construction of Gödel (the idea of which,
incidentally, goes back to Leibniz).

($d_1$) Construction of $N$ compatible with $N_0$:

$$N(a_1 \cdots a_r) = p_1^{N_0(a_1)} \cdots p_r^{N_0(a_r)},$$

where $p_1 = 2, p_2 = 3, \ldots$ are the prime numbers. Here $N(\Lambda) = 1$. We
verify that $N$ has the required properties.
($d_2$) $N$ is a numbering. First of all, $N : S(A) \to Z^+$ is an imbedding
because $N_0 : A \to Z^+$ is injective, and we have unique factorization in $Z^+$.
We show that the image of $N$ is decidable. In the first place, the set of
prime numbers in $Z^+$ is decidable, since it is the 2-level of the everywhere
defined recursive function

$$n \mapsto \text{the number of divisors of } n = \sum_{k=1}^{n} d(k, n) - n,$$

where (see §3 of Chapter V)

$$d(k, n) = \bar{s}\big((\mathrm{rem}(k, n) - k)^2 + 1\big) = \begin{cases} 2, & \text{if } k \mid n, \\ 1, & \text{otherwise,} \end{cases} \qquad \bar{s}(1) = 2, \quad \bar{s}(\ge 2) = 1.$$

Thus, the function $i \mapsto p_i$ is recursive (see the proof of Proposition 5.2 in
Chapter V).
We now set

$$f(n, i, y) = \bar{s}\big((\mathrm{rem}(p_i^y, n) - p_i^y)^2 + 1\big).$$

This function is recursive, and hence so is the function of $\langle n, i \rangle$

$$v_i(n) = \min\{ y \mid f(n, i, y) = 1 \} = (\text{the power of } p_i \text{ which divides } n) + 1.$$


This implies that the “length” function is recursive:

$$l(n) = \text{the number of prime divisors of } n = \sum_{i=1}^{n} s(v_i(n)) - n,$$

where $s(1) = 1$ and $s(k) = 2$ for $k \ge 2$
(automatically $p_m \nmid n$ when $m > n$, since $p_m > m$).
Now let $E$ be the image of $N_0$ in $Z^+$. Then

$$\text{image of } N = \{ n \mid \forall i \le l(n),\ v_i(n) \in E + 1 \}.$$

But the set $F = \{ \langle i, n \rangle \mid v_i(n) \in E + 1 \}$ is decidable, since it is the preimage
of $E + 1$ under $v$, and applying the bounded universal quantifier preserves
the decidability. In fact, let $\chi_F(i, n) = 1$ if $\langle i, n \rangle \in F$ and $\chi_F(i, n) = 2$ if
$\langle i, n \rangle \notin F$. Then the image of $N$ is the 2-level of the following function of
$n$: $\bar{s}\big(\prod_{i=1}^{l(n)} \chi_F(i, n)\big)$.

($d_3$) $N$ is admissible. We have already shown that the length function is
recursive. The $i$th coordinate function is represented by $[p_1^{v_i(n)}/p_1]$ (the
integral part). Finally, juxtaposition is represented by the function

$$m * n = m \prod_{j=1}^{l(n)} p_{l(m)+j}^{\,v_j(n) - 1},$$

which is recursive by what has already been proved.

We note that our number-theoretic functions are defined on all of $Z^+$,
not only on the Gödel numbers of any specific numbering. In what follows
we shall only point out when such an extension of the domains of
definition is possible if there is a special reason for mentioning this
possibility.

($d_4$) $N$ is compatible with $N_0$. The functions $x \mapsto 2^x$ and $y \mapsto \log_2(y)$
($y \in 2^{Z^+}$) tell us how to go from one numbering to the other on one-letter
expressions. These functions are obviously recursive.

This completes the proof of Proposition 1.9. □
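
The construction in (d) can be tried out directly. The following is a minimal Python sketch, not the text's formal recursive definitions: it works with ordinary integers and strings, takes a hypothetical three-letter alphabet with numbering `n0`, and uses trial division for the primes. The function names (`encode`, `exponents`, `length`, `coordinate`, `juxtapose`, `in_image`) are illustrative assumptions.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small sketch)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def first_primes(r):
    """Return the list [p_1, ..., p_r]."""
    gen = primes()
    return [next(gen) for _ in range(r)]


def encode(letters, n0):
    """N(a_1 ... a_r) = p_1^N0(a_1) * ... * p_r^N0(a_r); the empty expression gets 1."""
    n = 1
    for p, a in zip(first_primes(len(letters)), letters):
        n *= p ** n0[a]
    return n


def exponents(n):
    """Exponents of p_1, p_2, ... in n, up to the largest prime dividing n."""
    exps, gen = [], primes()
    while n > 1:
        p, e = next(gen), 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    return exps


def in_image(n, n0):
    """Decide whether n is the number of some expression (every exponent is an N0-value)."""
    values = set(n0.values())
    return all(e in values for e in exponents(n))


def length(n):
    """Length of the expression numbered n (meaningful when in_image holds)."""
    return len(exponents(n))


def coordinate(n, i):
    """Number of the one-letter expression in the i-th place (1-based)."""
    return 2 ** exponents(n)[i - 1]


def juxtapose(m, n):
    """m * n: shift the exponents of n onto the primes following those used by m."""
    em, en = exponents(m), exponents(n)
    result = m
    for p, e in zip(first_primes(len(em) + len(en))[len(em):], en):
        result *= p ** e
    return result
```

With the hypothetical numbering `n0 = {"x": 1, "(": 2, ")": 3}`, the expression `(x)` receives the number 2^2 · 3^1 · 5^3 = 1500, and juxtaposing the numbers of `(` and `x)` gives the same value.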


1.10. Concluding remarks. Proposition 1.9 shows that, if we are given an 
equivalence class of numberings of an alphabet A of a formal language, 
then this uniquely determines an equivalence class of numberings of the set
of expressions $S(A)$, of the set of texts $S(S(A))$, and so on, all of which
are compatible with the numberings of A in the given class. Hence, the set 
of recursive operations and the set of decidable or enumerable relations 
are invariantly defined on the expressions and texts. The only nonunique- 
ness that remains is the choice of the equivalence class of the numbering of 
A. 

In all cases of which the author is aware, this choice is also determined
canonically in the following way. Namely, $A$ is realized as a decidable
subset of the expressions in some finite “protoalphabet” $A_0$, where decidability
is understood in the sense of any numbering of $S(A_0)$ which is
compatible with any numbering of $A_0$. It follows from Lemmas 1.3 and 1.5
and Proposition 1.9 applied to $A_0$ that the resulting class of numberings of
$A$ will not depend on either the imbedding of $A$ in $S(A_0)$, the numbering
of $A_0$, or even the choice of $A_0$ (where we recall that, if $A_0 \subset A_0'$ are finite,
then $S(A_0) \subset S(A_0')$ is decidable).

From this point of view, it is natural to consider the nine-letter alphabet 
of SAr which was described in §10 of Chapter II to be a protoalphabet. 
Then x, x’,x”,x’”,... are elements of the “real alphabet.” Smullyan’s 
particular numbering system is very convenient for proving Tarski’s theo- 
rem, but the “undefinability of truth” in SAr does not depend on the 
special form of this numbering, as should by now be completely clear. 

More generally, any complete printed description of any alphabet A 
realizes A in the protoalphabet of available typographical symbols, which 
is of course finite, and thereby determines a canonical equivalence class of 
numberings of $A$.


2 Incompleteness principles 


2.1. Gödel’s theorem on the incompleteness of formal theories can be given
many precise formulations, none of which entirely exhausts its content. In 
this section, using the results obtained in §1, we shall try to separate the 
conceptual aspects of the theorem from the technical details needed to 
prove it for various languages. 


2.2. Let A be a finite or countable alphabet with its canonical equivalence 
class of numberings, and let S(A) be the set of expressions in A. We 
suppose that the following two subsets of S(A) have somehow been 
defined: 

(a) $T \subset S(A)$, the set of “true” expressions. For example, we might have
been given a language with $A$ as its alphabet, some sort of semantics for
the language, and a truth function.

(b) $D \subset S(A)$, the set of “provable” or “deducible” expressions. This set
might be described by giving “axioms” and “rules of deduction,” or in
some other way. We shall always assume that $D \subset T$, as the terminology
suggests (it is only possible to prove what is true).

There is every reason to expect that, if D and T have been constructed 
“in a natural way,” in the process of formalizing some fragment of modern 
mathematics, then the following principles hold true. 


2.3. The set $D$ is enumerable. The intuitive arguments to support this
assertion are as follows. Suppose that the “provable” expressions are those
for which “proofs” exist. Here “proofs” are certain texts which, perhaps,
are written in another alphabet $B$, i.e., they are elements of $S(S(B))$. (For
example, theorems in $L_1$Ar may be proved in $L_1$Set.) One minimal
requirement for formal mathematical proofs is that it must be possible
mechanically to determine that they are proofs, i.e., they must form a decidable
subset of $S(S(B))$. (Here it would actually be sufficient to require that the
set of “proofs” be enumerable.) Another unavoidable requirement is that
from every proof we must be able to obtain mechanically the “expression
proved” in $S(A)$. In other words, the partial function from $S(S(B))$ to
$S(A)$ given by “proof” $\mapsto$ “expression proved” must be (semi-)computable.
But then the image of this function is enumerable. In §5 we show that
the set of deducible formulas in $\mathcal{L}_1$ is enumerable, in accordance with these
informal considerations.
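
The informal argument can be illustrated by a toy sketch in Python. Everything here is a hypothetical stand-in chosen for brevity: the proof alphabet `ALPHABET_B`, the decidable test `is_proof`, and the map `conclusion` are not taken from any real deductive system. The sketch shows only the shape of the argument: run through all texts, keep those that are proofs, and output what each one proves.

```python
from itertools import count, islice, product

ALPHABET_B = "1+"  # hypothetical alphabet in which "proofs" are written


def all_texts(alphabet):
    """Enumerate every nonempty text over the alphabet, shortest first."""
    for n in count(1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def is_proof(text):
    """Decidable test: a toy 'proof' is a text of the form a+b with unary numerals a, b."""
    parts = text.split("+")
    return len(parts) == 2 and all(p and set(p) == {"1"} for p in parts)


def conclusion(text):
    """Computable map 'proof' -> 'expression proved': a+b proves the equation a+b=c."""
    a, b = text.split("+")
    return f"{a}+{b}={'1' * (len(a) + len(b))}"


def enumerate_provable():
    """Enumerate D as the image of `conclusion` on the decidable set of proofs."""
    for text in all_texts(ALPHABET_B):
        if is_proof(text):
            yield conclusion(text)


def first_provable(n):
    """Return the first n provable expressions in the order of enumeration."""
    return list(islice(enumerate_provable(), n))
```

Because `is_proof` is decidable and `conclusion` is defined on every proof, the generator never stalls; if the set of proofs were only enumerable, one would enumerate proofs instead of filtering all texts.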

We note that a time aspect has implicitly entered into the discussion. A 
“proof” is understood to mean a “proof using the means accepted at the 
present time and (semi-) identifiable as being accepted.” If, for example, 
we introduce a new axiom of set theory and it becomes widely accepted, 
then the concept of a proof becomes broader, as happened with the axiom 
of choice (or, rather, the principle of transfinite induction, Zorn’s 
lemma, . . . ). See the discussion in §7. 


2.4. The set T is not enumerable if the semantics of truth is rich enough to 
include elementary arithmetic. We clearly have in mind some version of 
Tarski’s theorem, which, in fact, even tells us that $T$ is not an arithmetical
set. In the next section we give several precise formulations of this 
principle. (See also subsections 7.3—7.4 below.) 


2.5. Gödel’s incompleteness theorem (General form). All formal theories of
mathematics satisfy the principles 2.3 and 2.4. Therefore, if a theory is
sufficiently rich, it always contains true expressions which are not provable.


3 Nonenumerability of true formulas 


The following criteria are all variations on a single theme, even if this is 
not obvious at first, namely, “self-reference, or the diagonal process.” 


3.1. The language SAr. We refer the reader to §10 of Chapter II for the 
description of this language and its standard interpretation. In §11 of 
Chapter II we showed that the set of numbers of true formulas in
Smullyan’s numbering system is nonarithmetical. This set is a fortiori 
nonenumerable, since enumerable sets are even Diophantine. 


3.2. The language $L_1$Ar. Here we give two versions of the argument, one of
which gives the stronger result and the other of which gives the more
concrete result. A third version, which is closer to Gödel’s original proof,
will be described in §7.

(a) Tarski’s theorem for $L_1$Ar. The proof that the set of true formulas in
$L_1$Ar is nonarithmetical can be reduced to Tarski’s theorem for SAr in the
following way. In the first place, the sets of formulas in $L_1$Ar and SAr are
decidable in the set of all expressions (this will be shown for $L_1$Ar in §4).

In the second place, the translation map tr : {formulas in SAr} $\to$ {formulas
in $L_1$Ar}, which was described in §10 of Chapter II, is recursive (as is
easily shown using the arguments in the next section). Since the map tr
preserves the truth function, we have $T_S = \mathrm{tr}^{-1}(T_{L_1})$ in the obvious
notation. But then, if $T_{L_1}$ were arithmetical, it would follow that $T_S$ is also
arithmetical (see the proof of Lemma 1.5), which contradicts Tarski’s
theorem for SAr. It would be a useful exercise for the reader, after first
reading §4, to carry out this proof in complete detail.

The following argument is simpler and more precise, but it only shows
that $T_{L_1}$ is nonenumerable, and not that it is nonarithmetical.

(b) Let $E \subset Z^+$ be an enumerable but undecidable set (which exists by
§5 of Chapter V). Let $E$ be defined by the formula $P(x)$ in $L_1$Ar which has
one free variable $x$. For $n \ge 2$ we set
$\bar{n} = +(\bar{1}, +(\bar{1}, \ldots, +(\bar{1}, \bar{1}) \cdots ))$, which is
the term-name for the integer $n$ in the obvious canonical $\mathcal{L}_1$ type notation.
We consider the family of closed formulas $\{ \neg(P(\bar{n})) \mid n \in Z^+ \}$ in $L_1$Ar.


3.3. Proposition.


(a) The function $Z^+ \to$ {formulas in $L_1$Ar} given by $n \mapsto \neg(P(\bar{n}))$ is
recursive.
(b) The set $\{ n \mid \neg(P(\bar{n})) \text{ is true} \}$ is nonenumerable.


Corollary. $T_{L_1}$ is nonenumerable; more precisely, the set of true formulas in
the family $\{ \neg(P(\bar{n})) \}$ is nonenumerable.


(If $T_{L_1}$ were enumerable, its preimage in $Z^+$ would also be enumerable.)


Proof.

(a) Let the formula $\neg(P(x))$ have the form $R_1 x R_2 x \cdots x R_{s+1}$,
where $x$ does not occur in the expressions $R_i$. Using the same notation as
in the proof of Proposition 1.9, for a fixed numbering $N$ of the set of
expressions with juxtaposition function $*$ we have:

$$N(\neg(P(\bar{n}))) = N(R_1) * N(\bar{n}) * N(R_2) * \cdots * N(\bar{n}) * N(R_{s+1}).$$

Hence, it suffices to show that the function $n \mapsto N(\bar{n})$ is recursive. But,
since $\overline{n+1} = +(\bar{1}, \bar{n})$, it follows that for $n \ge 1$

$$N(\overline{n+1}) = N(+) * N(\text{“(”}) * N(\bar{1}) * N(\bar{n}) * N(\text{“)”}),$$

which expresses $N(\overline{n+1})$ recursively in terms of $N(\bar{n})$.

(b) $\{ n \mid \neg(P(\bar{n})) \in T_{L_1} \} = Z^+ \setminus E$ by the definition of the formula $P(x)$
defining $E$. But the complement of $E$ is nonenumerable, since $E$ is
undecidable.

The proposition and the corollary are proved. □
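
The recursion used in part (a) can be mimicked on strings. In this hedged Python sketch, expressions are ordinary strings rather than Gödel numbers, the symbol `1` stands for the constant, and the numeral for n + 1 is the five-piece juxtaposition `+`, `(`, `1`, numeral of n, `)`, as in the displayed equation (no separating comma is written). The template `¬(E(x))` is a hypothetical placeholder for the formula ¬(P(x)).

```python
def numeral(n):
    """Term-name of n >= 1: '1' for n = 1, and '+(' + '1' + numeral(n) + ')' for n + 1."""
    if n < 1:
        raise ValueError("numerals are defined for positive integers only")
    term = "1"
    for _ in range(n - 1):
        term = "+(" + "1" + term + ")"
    return term


def instance(template, n, variable="x"):
    """Write the template as R_1 x R_2 x ... x R_{s+1} and put numeral(n) in place of each x."""
    pieces = template.split(variable)  # the pieces R_1, ..., R_{s+1}
    return numeral(n).join(pieces)
```

For example, `numeral(3)` is `+(1+(11))`, and `instance("¬(E(x))", 2)` is `¬(E(+(11)))`. Each step is a fixed number of juxtapositions, which is what makes the corresponding map on numbers recursive.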


3.4. Languages at least as rich as $L_1$Ar. Let $L$ be an arbitrary language with
a (finite or countable) alphabet $A$, in which we are given a set $T$ of “true”
expressions. We suppose that $L$ is no poorer than a language of arithmetic
in the following sense: There exists a translation map

$$\mathrm{tr} : \{ \text{formulas in } L_1\mathrm{Ar} \} \to \{ \text{expressions in } A \}$$

which takes $T_{L_1}$ to $T$, takes the complement of $T_{L_1}$ to the complement of $T$,
and is recursive.

Then $T$ is nonenumerable.

Such a translation map can be constructed for $L_1$Set, for example.
Proposition 3.3 shows that, actually, we need only know how to translate
into $L$ the formulas in the family $\neg(P(\bar{n}))$; this allows us to use a very
modest language of arithmetic.


3.5. Remarks 
(a) The series of Diophantine problems “Is $P(\bar{n})$ true?,” i.e., “Does the
Diophantine equation $F(n; x_1, \ldots, x_m) = 0$ have a solution in $Z^+$?” (where
$F$ is a suitable polynomial with integer coefficients, see Chapter VI) has the
property that no finitely describable collection of means of proof is
adequate to answer this series of questions completely. One might say that
even the theory of Diophantine equations is infinitely complicated.

(b) In some sense any problem in mathematics reduces to a Diophantine
problem. In fact, after translating the problem into a suitable formal
language, we may just ask, “Is the formula $P$ or the formula $\neg P$
provable?” But this is precisely the same as asking whether the number of
$P$ (the number of $\neg P$) belongs to the enumerable set $D$ of provable
formulas, i.e., whether the Diophantine equation corresponding to $D$ in the
given series is decidable.

This gives somewhat unexpected support for Gauss’s opinion regarding 
the queenly status of arithmetic. There even exists a “queen of the 
Diophantine equations” whose graph projects onto the set of numbers of 
formulas in $L_1$Set which are deducible from the Zermelo–Fraenkel axioms.

But, of course, we normally ask “Is P true?” and not “Is P provable?.” 
From this point of view, the most creative activity in mathematics is the 
discovery of new principles of proof which do not reduce to the “legacy of 
the past” and which again must be taken on faith. Set theory as a whole 
was the most recent such principle in the modern development of mathe-
matics. The dramatic history of its creation and of the disputes surround- 
ing its acceptance is worthy of a discovery of this magnitude. 

It is amazing that within formal mathematics it is possible to say 
something about such informal things. See also §7 below. 


4 Syntactic analysis 


4.1. This section contains the preliminary technical material which will be
needed in §5, when we prove that the set of deducible formulas in a
language of $\mathcal{L}_1$ is enumerable.

Let $L$ be a fixed language in $\mathcal{L}_1$ having a finite or countable alphabet $A$.
In order to shorten the technical work somewhat, we assume that we are
working with a dialect which contains only the connectives $\neg$ and $\to$ and
the quantifier $\forall$. This is not in any sense essential. As in §1, we have a
canonical equivalence class of numberings of $A$, which determines numberings
of $S(A)$, $S(S(A))$, and so on. The terms “recursive,” “decidable,” etc.
will be understood to refer to this class. Thus, we may omit explicit
mention of the numbering in the statements of the basic results. But in the
proofs it will be more convenient to work directly with a numbering. We
therefore fix one of the numberings $N : S(A) \to Z^+$ with juxtaposition
function $*$, length function $l$, and $i$th coordinate function $(k)_i$, as in the
proof of Proposition 1.9. We shall assume that $m * n > \max(m, n)$, i.e., the
number of any part of an expression is strictly less than the number of the
whole expression. Such an $N$ is called a Gödel numbering.

In addition to the conditions given in §1, we require that N satisfy the 
following conditions regarding recognition of the syntactic characteristics 
of the symbols of the alphabet: 


(a) The sets of variables, of constants, of operations, and of relations in A are 
decidable. 
(b) The “degree” function on the set of operations and relations is recursive. 


We are now ready to begin. But before reading further the reader is 
advised to review §1 of Chapter II. 


4.2. The partial function from $S(A) \times Z^+$ to $Z^+$ given by

$$\langle \text{an expression } P,\ i \rangle \mapsto \text{the place in } P \text{ containing the right parenthesis which corresponds to the left parenthesis in the } i\text{th place in } P$$

is computable, i.e., it is recursive and has a decidable domain of definition.


Proof. It will be convenient to use the following notation: if $Q$ is a
statement about integers in $Z^+$, then

$$\|Q\| = \begin{cases} 1, & \text{if } Q \text{ is true,} \\ 2, & \text{if } Q \text{ is false.} \end{cases}$$

This is a truth function which has been adjusted so as to take values in $Z^+$,
which does not contain zero.

We construct a function $\mathrm{Par}(k, i) : Z^+ \times Z^+ \to Z^+$ as follows: if $(k)_i$ is
not defined, or if $(k)_i \ne N(\text{“(”})$, or if $(k)_i = N(\text{“(”})$ but for all $j \in [i, l(k)]$,
$\sum_{m=i}^{j} \|(k)_m = N(\text{“(”})\| \ne \sum_{m=i}^{j} \|(k)_m = N(\text{“)”})\|$, let $\mathrm{Par}(k, i) = 1$;
otherwise, let

$$\mathrm{Par}(k, i) = \min\Big\{ j \ \Big|\ j \le l(k) \text{ and } \sum_{m=i}^{j} \|(k)_m = N(\text{“(”})\| = \sum_{m=i}^{j} \|(k)_m = N(\text{“)”})\| \Big\}.$$

Obviously, when restricted to $N(S(A)) \times Z^+$,
the function $\mathrm{Par}(k, i)$ gives the place in the expression $N^{-1}(k)$ containing
the “)” which corresponds to the “(” in the $i$th place if this is possible, and
gives 1 when this is not possible. (Compare with Lemma 1.2 in §1 of
Chapter II.) Hence, it suffices to show that $\mathrm{Par}(k, i)$ is recursive. But
$\mathrm{Par}(k, i)$ has been defined by gluing together a finite number (four) of
recursive functions having decidable domains of definition (by the properties
of $N$). Thus, $\mathrm{Par}(k, i)$ is recursive. □
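
For intuition, here is a small Python sketch of the same matching rule, working on strings instead of Gödel numbers (an assumption made for readability). It returns the 1-based place of the matching right parenthesis, and 1 when there is none, mirroring the convention of the text; a counter of unmatched parentheses replaces the comparison of the two sums.

```python
def par(expr, i):
    """Place (1-based) of the ')' matching the '(' in place i of expr, or 1 if there is none."""
    if not (1 <= i <= len(expr)) or expr[i - 1] != "(":
        return 1
    depth = 0  # number of '(' minus number of ')' seen in places i..j
    for j in range(i, len(expr) + 1):
        if expr[j - 1] == "(":
            depth += 1
        elif expr[j - 1] == ")":
            depth -= 1
        if depth == 0:
            return j  # first place where the two counts agree
    return 1
```

The value 1 can never be a genuine answer, because a matching parenthesis always lies strictly to the right of place i.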


4.3. The partial function $S(A) \to Z^+$ given by (an expression $P$) $\mapsto$ (the
number of terms in $L$ which are juxtaposed to get $P$) is computable.


We recall that this number is uniquely defined (§1 of Chapter II).


Proof. We first construct a formula which defines the function

$$LT(k) = \begin{cases} l(k) + 1, & \text{if } N^{-1}(k) \text{ is not a juxtaposition of terms;} \\ \text{the number of terms whose juxtaposition is } N^{-1}(k), & \text{otherwise} \end{cases}$$

from $Z^+$ to $Z^+$ recursively in terms of its values on smaller values of the
argument. The way to carry out this syntactic analysis of $N^{-1}(k)$ can be
described verbally as follows: first see whether or not $N^{-1}((k)_1)$ is a
variable or constant and, if it is, whether or not $N^{-1}((k)_2 * \cdots * (k)_{l(k)})$ is
a juxtaposition of terms; if $N^{-1}((k)_1)$ is not a variable or constant, check
whether it is an operation, and, if it is, whether it is followed by “(,”
whether there is a corresponding “),” whether a juxtaposition of the
required number of terms lies between the “(” and the “),” and whether “)”
is followed by a juxtaposition of terms.
To describe this procedure systematically, we set

$$f_1(k) = \begin{cases} (k)_2 * \cdots * (k)_{l(k)}, & \text{if } l(k) \ge 2; \\ 1, & \text{otherwise;} \end{cases}$$

$$f_2(k) = \begin{cases} (k)_3 * \cdots * (k)_{\mathrm{Par}(k, 2) - 1}, & \text{if } 4 \le \mathrm{Par}(k, 2); \\ 1, & \text{otherwise;} \end{cases}$$

$$f_3(k) = \begin{cases} (k)_{\mathrm{Par}(k, 2) + 1} * \cdots * (k)_{l(k)}, & \text{if } 1 < \mathrm{Par}(k, 2) < l(k); \\ 1, & \text{otherwise.} \end{cases}$$

All of these functions are recursive.
We now write the following recipe for computing $LT(k)$ recursively:

$l(k) = 1$ and $N^{-1}(k)$ is a variable $\Rightarrow LT(k) = 1$;
$l(k) = 1$ and $N^{-1}(k)$ is a constant $\Rightarrow LT(k) = 1$;
$l(k) = 1$ and $N^{-1}(k)$ is neither a variable nor a constant $\Rightarrow LT(k) = 2$;
$l(k) > 1$ and $N^{-1}((k)_1)$ is a variable $\Rightarrow LT(k) = 1 + LT(f_1(k))$;
$l(k) > 1$ and $N^{-1}((k)_1)$ is a constant $\Rightarrow LT(k) = 1 + LT(f_1(k))$;
$l(k) > 1$, $N^{-1}((k)_1)$ is an operation, $(k)_2 = N(\text{“(”})$,
    $4 \le \mathrm{Par}(k, 2) = l(k)$,
    degree $N^{-1}((k)_1) = LT(f_2(k)) \le l(f_2(k))$ $\Rightarrow LT(k) = 1$;
$l(k) > 1$, $N^{-1}((k)_1)$ is an operation, $(k)_2 = N(\text{“(”})$,
    $4 \le \mathrm{Par}(k, 2) < l(k)$,
    degree $N^{-1}((k)_1) = LT(f_2(k)) \le l(f_2(k))$,
    $LT(f_3(k)) \le l(f_3(k))$ $\Rightarrow LT(k) = 1 + LT(f_3(k))$;
$l(k) > 1$ and none of the previous additional conditions hold
    $\Rightarrow LT(k) = 1 + l(k)$.


To show that $LT$ is recursive, we first note that, for each of the above
eight alternatives, we can easily construct a recursive function $h_i(k, x, y, z)$
with the following property:

$$\| k \text{ satisfies the } i\text{th alternative} \| = h_i(k, LT(f_1(k)), LT(f_2(k)), LT(f_3(k)));$$

and we can also construct a recursive function $v_i(k, x, y, z)$ with the
property that $k$ satisfies the $i$th alternative $\Rightarrow$

$$LT(k) = v_i(k, LT(f_1(k)), LT(f_2(k)), LT(f_3(k))).$$

We therefore have the equation:

$$\begin{aligned} LT(k) &= 2 \sum_{i=1}^{8} v_i(k, LT(f_1(k)), LT(f_2(k)), LT(f_3(k))) \\ &\quad - \sum_{i=1}^{8} (v_i h_i)(k, LT(f_1(k)), LT(f_2(k)), LT(f_3(k))). \end{aligned}$$

Since $f_i(k) < k$ for $k > 1$, this formula allows us successively to compute
the values of $LT(k)$, starting with $LT(1)$. But the recursion here computes
the value at $k$ not in terms of the value at $k - 1$, but in terms of several
earlier values. It is this which presents the basic difficulty in showing that
the syntactic functions are recursive. We now describe the device for
overcoming this difficulty here and in all future cases.
In general, let $\varphi_1(k), \ldots, \varphi_s(k)$ be recursive functions having the
property that $\varphi_i(k) \le k$ for all $i \le s$ and $k \ge 1$. Further, let
$h(x_1, \ldots, x_n, k, y_1, \ldots, y_s)$ be a recursive function, and let
$g(x_1, \ldots, x_n, k)$ be defined by the relations

$$g(x_1, \ldots, x_n, 1) = \text{some known recursive function},$$
$$g(x_1, \ldots, x_n, k + 1) = h(x_1, \ldots, x_n, k, g(x_1, \ldots, x_n, \varphi_1(k)), \ldots, g(x_1, \ldots, x_n, \varphi_s(k))).$$

Using the juxtaposition function $*$, we let

$$G(x_1, \ldots, x_n, k) = \mathop{*}_{i=1}^{k} g(x_1, \ldots, x_n, i).$$

Since

$$g(x_1, \ldots, x_n, i) = (G(x_1, \ldots, x_n, k))_i$$

for all $i \le l(G(x_1, \ldots, x_n, k)) = k$, and in particular for the greatest such
$i$, it follows that, to verify that $g$ is recursive, it suffices to show that $G$ is
recursive. But for $k \ge 1$ we have

$$\begin{aligned} G(x_1, \ldots, x_n, k + 1) &= G(x_1, \ldots, x_n, k) * g(x_1, \ldots, x_n, k + 1) \\ &= G(x_1, \ldots, x_n, k) * h(x_1, \ldots, x_n, k, (G(x_1, \ldots, x_n, k))_{\varphi_1(k)}, \ldots, (G(x_1, \ldots, x_n, k))_{\varphi_s(k)}), \end{aligned}$$

which is in the standard form for a recursive equation.
If we apply this device to $LT$, setting $n = 0$, $s = 3$, and $\varphi_i(k) = f_i(k + 1)$,
we obtain the recursiveness of $LT$. □
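
The device amounts to carrying the whole history of values along, so that each step refers only to the immediately preceding history. A minimal Python sketch for the case n = 0 follows; a list plays the role of $G$, and the name `course_of_values` and the sample functions in the note below are illustrative assumptions, not part of the text.

```python
def course_of_values(g1, h, phis, k):
    """Compute g(k), where g(1) = g1 and g(j + 1) = h(j, g(phi_1(j)), ..., g(phi_s(j))).

    Each phi in phis must satisfy 1 <= phi(j) <= j, so every value it refers to
    is already stored in the history.
    """
    history = [g1]  # history[i - 1] == g(i); this list plays the role of G(k)
    for j in range(1, k):
        earlier = [history[phi(j) - 1] for phi in phis]  # the coordinates (G(j))_{phi(j)}
        history.append(h(j, *earlier))  # G(j + 1) = G(j) * h(...)
    return history[k - 1]
```

As a usage example, take `phis = [lambda j: j, lambda j: max(1, j - 1)]` and `h = lambda j, a, b: a + b` with `g1 = 1`: then g(2) = 2, g(3) = 3, g(4) = 5, g(5) = 8, each value depending on two earlier ones rather than only on its predecessor.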


Corollary. The set of terms is decidable.
In fact, this set is the 1-level of the computable function $LT$.
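
The recipe of 4.3 can be followed step by step on strings. The Python sketch below is only an illustration under stated assumptions: the sets `VARIABLES`, `CONSTANTS` and the degree table `OPERATIONS` describe a hypothetical alphabet of one-character symbols, direct recursion on substrings replaces the recursion on Gödel numbers, and the empty expression is given the value 0, a case the recipe never reaches.

```python
VARIABLES = {"x", "y"}          # hypothetical variables
CONSTANTS = {"1"}               # hypothetical constants
OPERATIONS = {"+": 2, "s": 1}   # hypothetical operations with their degrees


def par(expr, i):
    """Place (1-based) of the ')' matching the '(' in place i, or 1 if there is none."""
    if not (1 <= i <= len(expr)) or expr[i - 1] != "(":
        return 1
    depth = 0
    for j in range(i, len(expr) + 1):
        if expr[j - 1] == "(":
            depth += 1
        elif expr[j - 1] == ")":
            depth -= 1
        if depth == 0:
            return j
    return 1


def lt(expr):
    """Number of terms juxtaposed in expr, or len(expr) + 1 if expr is no such juxtaposition."""
    if not expr:
        return 0  # assumption of this sketch; the recipe never recurses on the empty expression
    fail = len(expr) + 1
    head = expr[0]
    if head in VARIABLES or head in CONSTANTS:
        rest = expr[1:]                       # the role of f_1
    elif head in OPERATIONS and expr[1:2] == "(":
        close = par(expr, 2)
        if close < 4:
            return fail
        inner = expr[2:close - 1]             # the role of f_2
        if not (lt(inner) == OPERATIONS[head] <= len(inner)):
            return fail
        rest = expr[close:]                   # the role of f_3
    else:
        return fail
    if not rest:
        return 1
    tail = lt(rest)
    return 1 + tail if tail <= len(rest) else fail


def is_term(expr):
    """The set of terms is the 1-level of lt."""
    return lt(expr) == 1
```

For instance, `lt("x1+(1x)")` is 3 (three terms side by side), while `lt("+(1)")` is 5, the failure value, since `+` has degree 2 in this hypothetical alphabet.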


4.4. The set of atomic formulas is decidable.
In fact,
$N^{-1}(k)$ is an atomic formula $\Leftrightarrow$ $(k)_1$ is a relation, $(k)_2 = N(\text{“(”})$,
$\mathrm{Par}(k, 2) = l(k) \ge 4$,
and degree $N^{-1}((k)_1) = LT(f_2(k)) \le l(f_2(k))$,
where $f_2(k)$ was defined in 4.3.


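4.5. The set of formulas is decidable.
In fact, in our dialect, which has been simplified to include only $\neg$, $\to$,
and $\forall$, we have:

$N^{-1}(k)$ is a formula
$\Leftrightarrow$ $N^{-1}(k)$ is an atomic formula, or is of the form $\neg(P)$,
$(P) \to (Q)$, or $\forall x(P)$, where $P$ and $Q$ are formulas and $x$ is a variable.

Using the procedure in 4.3, we define the recursive functions

$$f_4(k) = \begin{cases} (k)_3 * \cdots * (k)_{l(k) - 1}, & \text{if } l(k) \ge 4; \\ 1, & \text{otherwise;} \end{cases}$$

$$f_5(k) = \begin{cases} (k)_2 * \cdots * (k)_{\mathrm{Par}(k, 1) - 1}, & \text{if } \mathrm{Par}(k, 1) \ge 3; \\ 1, & \text{otherwise;} \end{cases}$$

$$f_6(k) = \begin{cases} (k)_{\mathrm{Par}(k, 1) + 3} * \cdots * (k)_{l(k) - 1}, & \text{if } 3 \le \mathrm{Par}(k, 1) < l(k) - 1; \\ 1, & \text{otherwise;} \end{cases}$$

$$f_7(k) = \begin{cases} (k)_4 * \cdots * (k)_{l(k) - 1}, & \text{if } l(k) \ge 5; \\ 1, & \text{otherwise;} \end{cases}$$

$$\mathrm{At}(k) = \begin{cases} 1, & \text{if } N^{-1}(k) \text{ is an atomic formula;} \\ 2, & \text{otherwise.} \end{cases}$$

The function

$$\mathrm{Fm}(k) = \begin{cases} 1, & \text{if } N^{-1}(k) \text{ is a formula,} \\ 2, & \text{otherwise} \end{cases}$$

is computed using the following recursive relation (where $s(1) = 1$ and
$s(k) = 2$ for $k \ge 2$):

$$\begin{aligned} \mathrm{Fm}(k) = s \circ \min\{ & \mathrm{At}(k); \\ & \|(k)_1 = N(\text{“}\neg\text{”})\| \cdot \|(k)_2 = N(\text{“(”})\| \cdot \|\mathrm{Par}(k, 2) = l(k) \ge 4\| \cdot \mathrm{Fm}(f_4(k)); \\ & \|(k)_1 = N(\text{“(”})\| \cdot \|\mathrm{Par}(k, 1) \ge 3\| \cdot \mathrm{Fm}(f_5(k)) \cdot \|(k)_{\mathrm{Par}(k, 1) + 1} = N(\text{“}\to\text{”})\| \\ & \quad \cdot \|(k)_{\mathrm{Par}(k, 1) + 2} = N(\text{“(”})\| \cdot \|\mathrm{Par}(k, \mathrm{Par}(k, 1) + 2) = l(k)\| \cdot \mathrm{Fm}(f_6(k)); \\ & \|(k)_1 = N(\text{“}\forall\text{”})\| \cdot \|(k)_2 = N(\text{a variable})\| \cdot \|(k)_3 = N(\text{“(”})\| \\ & \quad \cdot \|\mathrm{Par}(k, 3) = l(k) \ge 5\| \cdot \mathrm{Fm}(f_7(k)) \}. \end{aligned}$$

$\mathrm{Fm}(k)$ is now shown to be recursive using the device described in 4.3.

Corollary. The sets of formulas of the form $\neg(P)$, $(P) \to (Q)$, and $\forall x(P)$
are decidable.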


4.6. The following function from $S(A) \times \mathbf{Z}^+ \times S(A)$ to $S(A)$ is computable:
$\langle P, i, Q \rangle \mapsto$ the result of substituting $P$ for the $i$th symbol in $Q$.


We set
$$\mathrm{Sub}(k, i, m) = \begin{cases} (m)_1 * \cdots * (m)_{i-1} * k * (m)_{i+1} * \cdots * (m)_{l(m)}, & \text{if } i \le l(m); \\ 1, & \text{otherwise.} \end{cases}$$
This function is clearly recursive, and coincides with the required map on
the set of $\langle k, i, m \rangle$ with $k, m \in N(S(A))$. □
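
The symbol-level operation behind Sub can be illustrated on strings rather than on numbers. The following Python fragment is only a sketch under stated assumptions: expressions are modelled as Python strings, and the helper name `sub` and the use of `None` in place of the default value 1 are choices of this illustration, not part of the construction above.

```python
from typing import Optional


def sub(p: str, i: int, q: str) -> Optional[str]:
    """Substitute the expression p for the i-th symbol (1-based) of q.

    Mirrors Sub(k, i, m) on expressions instead of on their numbers.
    When i exceeds the length of q, the numeric version returns the
    default value 1; this sketch (an assumption) returns None instead.
    """
    if 1 <= i <= len(q):
        # symbols 1..i-1 of q, then p, then symbols i+1..l(q) of q
        return q[: i - 1] + p + q[i:]
    return None


print(sub("(y+1)", 3, "x=x"))  # third symbol of "x=x" replaced by "(y+1)"
print(sub("t", 5, "abc"))      # position out of range
```

Working on strings hides the arithmetic of the numbering; the point of 4.6 is that the same operation remains recursive after expressions are replaced by their numbers.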


4.7. The following relation in $\mathbf{Z}^+ \times S(A) \times S(A)$ is decidable: “the one-letter
expression $x$ is a free variable in the $i$th place in the formula $P$.”
In fact, we set


$$\mathrm{Fr}(i, k, l) = \begin{cases} 1, & \text{if the condition in 4.7 holds for } P = N^{-1}(k) \text{ and } \langle x \rangle = N^{-1}(l); \\ 2, & \text{otherwise.} \end{cases}$$
Then we have:
$N^{-1}(k)$ is not a formula, or $N^{-1}(l)$ is not a variable,
or $i > l(k)$ $\implies$ $\mathrm{Fr}(i, k, l) = 2$.


Now suppose that $N^{-1}(k)$ is a formula, $N^{-1}(l)$ is a variable, and $i \le l(k)$.
Then the following alternatives remain:


$l \ne (k)_i$ $\implies$ $\mathrm{Fr}(i, k, l) = 2$;
$l = (k)_i$, $\mathrm{At}(k) = 1$ $\implies$ $\mathrm{Fr}(i, k, l) = 1$;
$l = (k)_i$, $N^{-1}(k)$ has the form $\neg(P)$
$\implies$ $\mathrm{Fr}(i, k, l) = \mathrm{Fr}(i, f_2(k), l)$;
$l = (k)_i$, $N^{-1}(k)$ has the form $(P)\to(Q)$, $i < \mathrm{Par}(k, 1)$
$\implies$ $\mathrm{Fr}(i, k, l) = \mathrm{Fr}(i, f_3(k), l)$;
$l = (k)_i$, $N^{-1}(k)$ has the form $(P)\to(Q)$, $i > \mathrm{Par}(k, 1) + 2$
$\implies$ $\mathrm{Fr}(i, k, l) = \mathrm{Fr}(i, f_4(k), l)$;
$l = (k)_i$, $N^{-1}(k)$ has the form $\forall x(P)$, $(k)_2 = l$ $\implies$ $\mathrm{Fr}(i, k, l) = 2$;
$l = (k)_i$, $N^{-1}(k)$ has the form $\forall x(P)$, $(k)_2 \ne l$
$\implies$ $\mathrm{Fr}(i, k, l) = \mathrm{Fr}(i, f_5(k), l)$.


Here the functions $f_i$ are those defined in 4.5. The rest of the proof
that Fr is recursive follows the same procedure as in 4.4 and 4.5. □


4.8. The set $\{\langle x, P, t \rangle \mid x$ is a variable, $P$ is a formula, $t$ is a term, and $x$ does
not bind $t$ in $P\}$ is decidable.
In terms of the numbers $\langle i, k, m \rangle$, this condition means that:


$$\begin{aligned}
\forall j \le l(k)\ \{\, & \text{either } (k)_j \ne i, \text{ or else } (k)_j = i \wedge \mathrm{Fr}(j, k, i) = 2, \\
& \text{or else } (k)_j = i \wedge \mathrm{Fr}(j, k, i) = 1 \\
& \wedge\ \forall n \in [1, l(m)]\ \big(\mathrm{Fr}(j + n - 1, \mathrm{Sub}(m, j, k), (\mathrm{Sub}(m, j, k))_{j+n-1}) \\
& \qquad = \|(m)_n \text{ is a variable}\|\big) \,\}.
\end{aligned}$$
That is, if $t$ is substituted in place of any free occurrence of $x$ in $P$, all the
variables in $t$ remain free. □


4.9. The following partial function is computable: $\langle x, P, t \rangle \mapsto$ the result of
substituting $t$ in place of all free occurrences of $x$ in $P$.
Let $\langle i, k, m \rangle$ be the numbers of $x$, $P$, and $t$. We set


$$f(j, k, i, m) = \begin{cases} (k)_j, & \text{if } \mathrm{Fr}(j, k, i) = 2; \\ m, & \text{if } \mathrm{Fr}(j, k, i) = 1. \end{cases}$$


This is a recursive function. We further set
$$\mathrm{Sub}_1(i, k, m) = \mathop{*}_{j=1}^{l(k)} f(j, k, i, m).$$


This is the number of the expression obtained by substituting $t$ in place of
all free occurrences of $x$ in $P$. □


5 Enumerability of deducible formulas 


5.1. General setup. Let $L$ be any language with a numbered countable
alphabet $A$. We suppose that the following data is fixed:


(a) An enumerable set of “axioms” $\mathrm{Ax} \subset S(A)$.
(b) A partial recursive function $\mathrm{Inf} : \mathbf{Z}^+ \times S(S(A)) \to S(A)$, i.e., an
enumerable family of “rules of deduction.”


We shall say that an expression $P \in S(A)$ is a direct consequence of the
expressions $P_1, \ldots, P_r$ by the $i$th rule of deduction, if
$\langle i, \langle P_1, \ldots, P_r \rangle \rangle \in D(\mathrm{Inf})$ and $\mathrm{Inf}(i, \langle P_1, \ldots, P_r \rangle) = P$. We shall call an
expression $P$ deducible (from the “axioms”) if there exists a finite sequence
of expressions $P_1, \ldots, P_n = P$ such that for each $j \le n$ either $P_j \in \mathrm{Ax}$ or there
exist $i \in \mathbf{Z}^+$ and $\{P_{j_1}, \ldots, P_{j_r}\} \subset \{P_1, \ldots, P_{j-1}\}$ such that $P_j$ is a direct
consequence of $P_{j_1}, \ldots, P_{j_r}$ by the $i$th rule of deduction. We let $D$ denote the
set of all deducible expressions.


5.2. Proposition. D is enumerable. 


PROOF. Let $a : \mathbf{Z}^+ \to S(A)$ be a recursive function whose image coincides
with Ax, and let $\mathrm{inf} : \mathbf{Z}^+ \to S(A)$ be the partial recursive function given by
$\mathrm{inf}(n) = \mathrm{Inf}(t_1^{(2)}(n), N_1^{-1}(t_2^{(2)}(n)))$, where $N_1 : S(S(A)) \to \mathbf{Z}^+$ is any
numbering of the texts which is compatible with the given numbering of the
expressions.
We construct a recursive function $d : \mathbf{Z}^+ \to S(A)$ as follows:
$$d(2n - 1) = a(n), \qquad d(2n) = \mathrm{inf}(n), \qquad n \ge 1.$$

We claim that its image is $D$. In fact, it suffices to verify that (a)
$\mathrm{Ax} \subset$ image of $d$; and (b) if $P_1, \ldots, P_r \in$ image of $d$ and $P$ is a direct
consequence of $P_1, \ldots, P_r$ by the $i$th rule of deduction, then $P \in$ image of
$d$.


But (a) is obvious, since all the axioms are written out in the
odd-numbered places. To verify (b), we choose $n$ so that


$$t_1^{(2)}(n) = i, \qquad t_2^{(2)}(n) = N_1(\langle P_1, \ldots, P_r \rangle).$$
Then $d(2n) = P$. The proposition is proved. □
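
The idea that the deducible expressions can be listed mechanically may be easier to see in a toy setting. The sketch below is not the function $d$ of the proof; it is a stage-by-stage closure of a set of axioms under rules, written under several assumptions: expressions are plain strings, the arrow is written `->`, and the names `modus_ponens` and `enumerate_deducible` are hypothetical.

```python
from itertools import product


def modus_ponens(p, q):
    """Partial rule: return Q when q is literally '(p)->(Q)', else None."""
    prefix = "(" + p + ")->("
    if q.startswith(prefix) and q.endswith(")"):
        return q[len(prefix):-1]
    return None


def enumerate_deducible(axioms, rules):
    """Yield each deducible expression once, stage by stage.

    axioms: an iterable (possibly infinite) of expressions.
    rules:  a list of (arity, partial function) pairs.
    At every stage one more axiom is admitted and every rule is applied
    to every tuple of expressions found so far.
    """
    axiom_stream = iter(axioms)
    found = []   # expressions in order of discovery
    seen = set()
    while True:
        candidates = []
        axiom = next(axiom_stream, None)
        if axiom is not None:
            candidates.append(axiom)
        for arity, rule in rules:
            for args in product(found, repeat=arity):
                result = rule(*args)
                if result is not None:
                    candidates.append(result)
        progressed = False
        for expr in candidates:
            if expr not in seen:
                seen.add(expr)
                found.append(expr)
                progressed = True
                yield expr
        if axiom is None and not progressed:
            return  # closure reached (only possible for finite systems)


axioms = ["A", "(A)->(B)", "(B)->(C)"]
deduced = list(enumerate_deducible(axioms, [(2, modus_ponens)]))
print(deduced)
```

For an infinite axiom stream the generator never returns, which matches the situation in the text: membership in $D$ is confirmed when an expression appears, but no stage certifies that an expression will never appear.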


We now verify that the general setup in 5.1 can always be realized in
languages of $\mathcal{L}_1$.


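5.3. The rules of deduction Gen and MP. We define the map
$\mathrm{Inf} : \mathbf{Z}^+ \times S(S(A)) \to S(A)$ as follows:
$$D(\mathrm{Inf}) = \{\langle 1, \langle P, (P)\to(Q) \rangle \rangle \mid P \text{ and } Q \text{ are formulas}\} \cup \{\langle i, \langle P \rangle \rangle \mid P \text{ is a formula},\ i \ge 2\},$$
$$\mathrm{Inf}(1, \langle P, (P)\to(Q) \rangle) = Q, \qquad \mathrm{Inf}(i, \langle P \rangle) = \forall x_{i-1}(P),$$
where $x_j$ is the $j$th variable in $L$ in any fixed numbering of the variables
which has image $\mathbf{Z}^+$ and is compatible with the numbering of $A$. It is clear
that Inf is recursive and exhausts the rules of deduction Gen and MP.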
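
The way a single partial function packages infinitely many rules can be sketched in Python. This is an illustration only: formulas are plain strings that are not checked for well-formedness (unlike in the definition of $D(\mathrm{Inf})$), `None` stands for "undefined", and the function name `inf` and the spelling `x1, x2, ...` for variables are assumptions of the sketch.

```python
from typing import Optional, Tuple


def inf(i: int, text: Tuple[str, ...]) -> Optional[str]:
    """Toy version of Inf(i, <P_1, ..., P_r>); None means 'undefined'.

    Rule 1 is MP; rule i >= 2 is generalization over the variable x_{i-1}.
    """
    if i == 1 and len(text) == 2:
        # MP: from P and (P)→(Q) infer Q
        p, implication = text
        prefix = "(" + p + ")→("
        if implication.startswith(prefix) and implication.endswith(")"):
            return implication[len(prefix):-1]
        return None
    if i >= 2 and len(text) == 1:
        # Gen: from P infer ∀x_{i-1}(P)
        return "∀x" + str(i - 1) + "(" + text[0] + ")"
    return None


print(inf(1, ("A", "(A)→(B)")))  # MP
print(inf(3, ("A",)))            # Gen over x2
```

Indexing Gen by $i \ge 2$ is what turns "one rule for each variable" into an enumerable family given by one function.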


5.4. The axioms. We verify that the following sets are enumerable in any
language in $\mathcal{L}_1$:


(a) The tautologies. 
(b) The logical quantifier axioms. 
(c) The axioms of equality. 


Two other sets we show to be enumerable are: 


(d) The special axioms of $L_1$Ar.
(e) The special axioms of $L_1$Set.


Actually, using the methods of §4 it is not hard to prove that all of these 
sets are even decidable. But the proof of enumerability is somewhat 
shorter, and will suffice for our purposes. 


5.5. The tautologies. In §5 of Chapter II we constructed a finite list of basis 
tautologies and showed that all the other tautologies can be deduced from 
them using MP. Thus, by Proposition 5.2, it is sufficient to verify that the 
basis tautologies are enumerable. 

Each of the basis tautologies determines a set of formulas of the form


$$Q_1 P_{i_1} Q_2 P_{i_2} \cdots P_{i_r} Q_{r+1},$$
where the $Q_j$ are fixed expressions which are nonempty (with the possible
exception of $Q_1$ and $Q_{r+1}$); $i_1, \ldots, i_r \in \{1, \ldots, m\}$; and $\langle P_1, \ldots, P_m \rangle$
varies over all ordered $m$-tuples of formulas in $L$. Since the set of such
$m$-tuples is decidable by 4.5 above, and since the operation of juxtaposition
is recursive, it is clear that we obtain an enumerable set of formulas.


5.6. The logical quantifier axioms. In case our dialect of $\mathcal{L}_1$ does not have $\exists$,
these axioms can be expressed as the following two axiom schemes:


(a) $(\forall x(P(x))) \to (P(t))$, if $x$ does not bind the term $t$ in the formula $P$.
(b) $(\forall x((P)\to(Q))) \to ((P)\to(\forall x(Q)))$, if $x$ does not occur freely in $P$.


By 4.8, the set of triples $\{\langle x, P, t \rangle \mid x$ does not bind $t$ in $P\}$ is decidable,
and, by 4.9, the map $\langle x, P, t \rangle \mapsto P(t)$ is recursive. Since juxtaposition is
also recursive, the set of axioms (a) is the image of a decidable set under a
recursive function, and so is enumerable.

We may similarly conclude that (b) is enumerable if we verify that the
condition “$x$ does not occur freely in $P$” is decidable. But this is equivalent
to the following condition: “the formula obtained from $P$ by substituting
either of the variables $x_1$ or $x_2$ in place of all free occurrences of $x$
in $P$ coincides with $P$,” where $\langle x_1, x_2 \rangle$ is any fixed pair of distinct
variables. This condition is decidable by 4.9.


5.7. The axioms of equality. By the definition in 4.6 of Chapter II, it suffices
to show that the set of formulas of the form


$$(x = y) \to (P(x, x) \to P(x, y))$$


is enumerable, where $P$ runs through the atomic formulas in the language,
$x$ and $y$ are variables, and $P(x, y)$ is obtained from $P$ by replacing $x$ by $y$
in any subset of the occurrences of $x$ in $P$. This set of formulas can be
obtained, for example, as the image of the following function, which is
partial recursive by the results in 4.4 and 4.6:


$$S(A) \times A^1 \times A^1 \times S(\mathbf{Z}^+) \to S(A);$$
$\langle P, x, y, \langle i_1, \ldots, i_r \rangle \rangle \mapsto$ the expression obtained by substituting $y$ in the
$i_1, \ldots, i_r$ places in the atomic formula $P$ if $x$ occurs in those places.


5.8. The special axioms of $L_1$Ar and $L_1$Set. Most of these axioms only
contain variables of the language, and not “metalanguage” variables for
formulas. This is true of all the axioms of arithmetic except for induction
and all the axioms of set theory except for replacement. Each set of axioms
not containing variable formulas is decidable because it can be described
by a condition such as “the set of formulas of length 40 in which ‘(’ is in
the first place, ‘∀’ is in the second place, a variable is in the third place,
‘(’ is in the fourth place, ..., ‘)’ is in the 39th place, and ‘)’ is in the
40th place; in which the variables in the 3rd, 8th, and 16th places are the
same, in the 9th and 36th places are the same, and in the 17th and 37th
places are the same; and in which these three variables are pairwise
distinct.” (This is the axiom of regularity in $L_1$Set in normalized notation.)
Here we could also write down just one copy of each such axiom and
generate the rest using Gen, the axiom of specialization, and MP.

The axioms of induction and replacement are shown to be enumerable 
using the same procedure as in the case of the basis tautologies and the 
quantifier axioms. We leave the details to the reader. 


6 The arithmetical hierarchy 


6.1. Using recursion on $n$, we define the classes $\Sigma_n$ and $\Pi_n$ of subsets of
$(\mathbf{Z}^+)^m$, $m = 0, 1, 2, \ldots$, as follows:


(a) $\Sigma_0 = \Pi_0$ = {decidable sets}.

(b) $\Sigma_{n+1}$ = {projections of elements of $\Pi_n$ having codimension $\ge 1$}.

(c) $\Pi_{n+1}$ = {complements of elements of $\Sigma_{n+1}$ in their ambient spaces
$(\mathbf{Z}^+)^m$}.

Obviously, $\Sigma_1$ consists of all enumerable sets (see Theorem 1.2 of Chapter
VI), and $\Pi_1$ consists of their complements. The following result justifies
calling $\{\Sigma_n, \Pi_n\}$ “the arithmetical hierarchy.”


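6.2. Proposition.
(a) $\forall n \ge 0$, $\Sigma_n \cup \Pi_n \subset \Sigma_{n+1} \cap \Pi_{n+1}$.
(b) $\bigcup_{n=0}^{\infty} \Sigma_n = \bigcup_{n=0}^{\infty} \Pi_n$ = {arithmetical sets}, i.e., all sets definable
by formulas in $L_1$Ar.
(c) For $n \ge 1$ the sets in $\Sigma_n$ are precisely those which can be defined by
formulas of the following $\Sigma_n$ type (where the quantifiers are taken over
variables in $\mathbf{Z}^+$, and $E$ is a decidable set):


$\exists x_1 \forall x_2 \exists x_3 \cdots \forall x_n (\langle x_1, \ldots, x_n, x_{n+1}, \ldots, x_m \rangle \in E)$, $n$ even;
$\exists x_1 \forall x_2 \cdots \exists x_n (\langle x_1, \ldots, x_n, x_{n+1}, \ldots, x_m \rangle \in E)$, $n$ odd.
Similarly, for $\Pi_n$:

$\forall x_1 \exists x_2 \forall x_3 \cdots \exists x_n (\langle x_1, \ldots, x_n, x_{n+1}, \ldots, x_m \rangle \in E)$, $n$ even;
$\forall x_1 \exists x_2 \forall x_3 \cdots \forall x_n (\langle x_1, \ldots, x_n, x_{n+1}, \ldots, x_m \rangle \in E)$, $n$ odd.


(d) The sets in $\Sigma_n$ or $\Pi_n$ are definable by the analogous formulas in
$L_1$Ar with the following changes: instead of $\langle x_1, \ldots, x_m \rangle \in E$ we have
any atomic formula, and the number of quantifiers is $\ge n$, with exactly
$n - 1$ alternations from $\exists$ to $\forall$ or $\forall$ to $\exists$.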


PROOF. 

(a) We use induction on $n$. For $n = 0$ we have $\Sigma_0 \cup \Pi_0 = \Sigma_1 \cap \Pi_1$ by the
definition of decidable sets. If $\Sigma_{n-1} \subset \Sigma_n$, then $\Sigma_n \subset \Sigma_{n+1}$ (since $\Sigma_{n+1}$
consists of projections of the complements of elements of $\Sigma_n$, and $\Sigma_n$
consists of projections of the complements of elements of $\Sigma_{n-1}$), and also
$\Pi_n \subset \Pi_{n+1}$, by the definition of $\Pi$. Finally, we have $\Pi_n \subset \Sigma_{n+1}$, from
which it trivially follows that $\Sigma_n \subset \Pi_{n+1}$. In fact, if $E \in \Pi_n$, then
$E \times \mathbf{Z}^+ \in \Pi_n$ (since taking the product with $\mathbf{Z}^+$ commutes with complements and
projections, and takes $\Sigma_0 = \Pi_0$ to itself), and hence $E$ = a projection of
$E \times \mathbf{Z}^+ \in \Sigma_{n+1}$.

(b) It follows from (a) that $\bigcup_{n=0}^{\infty} \Sigma_n = \bigcup_{n=0}^{\infty} \Pi_n$. This class of sets is
contained in the arithmetical sets, since all enumerable sets are arithmetical,
and arithmeticality is preserved when taking projections and complements,
which correspond to inserting $\exists$ and $\neg$, respectively.

In order to prove the converse {arithmetical sets} $\subset \bigcup_{n=0}^{\infty} \Sigma_n = \Sigma_\infty$, we
first note that all sets definable by atomic formulas are decidable, and the
rest of the arithmetical sets are obtained from them by taking projections,
complements, unions, and intersections (see §2 of Chapter II). Thus, it
suffices to show that $\Sigma_\infty$ is closed with respect to (finite) unions and
intersections. We claim that this is actually true for each $\Sigma_n$ separately.

We prove this by induction on $n$. The result has already been proved for
$\Sigma_0$. If $\Sigma_n$ is closed with respect to $\cap$, then $\Pi_n$ is closed with respect to $\cup$.
Suppose $E_1, E_2 \in \Sigma_{n+1}$, $E_i$ = a projection of $F_i$ and $F_i \in \Pi_n$. We can then
introduce dummy variables so as to identify the ambient spaces of the $F_i$
and the projection of these spaces onto an ambient space for both the $E_i$.
Then $E_1 \cup E_2$ = a projection of $F_1 \cup F_2$, so that $E_1 \cup E_2 \in \Sigma_{n+1}$. Thus,
$\Sigma_{n+1}$ is closed with respect to $\cup$.

Similarly, if $\Sigma_n$ is closed with respect to $\cup$, it follows that $\Pi_n$ is closed
with respect to $\cap$, and an analogous argument shows that $\Sigma_{n+1}$ is closed
with respect to $\cap$. However, here we must embed the products $F_1 \times (\mathbf{Z}^+)^{m_1}$
and $(\mathbf{Z}^+)^{m_2} \times F_2$ for certain $m_1$ and $m_2$ in a single space in such a way that,
when we identify the two projections, we have
$\mathrm{pr}(F_1 \times (\mathbf{Z}^+)^{m_1} \cap (\mathbf{Z}^+)^{m_2} \times F_2) = \mathrm{pr}\, F_1 \cap \mathrm{pr}\, F_2$. In terms of formulas this means that the variables
bound by the $\exists$ quantifiers in the formulas corresponding to $F_1$ and $F_2$
must be renamed so that they form two disjoint sets.

(c) This assertion is proved by induction on $n$ and a simple examination
of the definitions. Here, whenever we take the complement, we must move
the corresponding $\neg$ to the right of all the quantifiers by means of the
usual commutation rule $\neg\forall = \exists\neg$, $\neg\exists = \forall\neg$. If we have a projection
of codimension $m \ge 2$, which is defined by a series of quantifiers
$\exists x_{i_1} \cdots \exists x_{i_m}$, we must reduce it to a projection of codimension 1 by
replacing the set of variables $\langle x_{i_1}, \ldots, x_{i_m} \rangle$ by $\langle t_1^{(m)}(y), \ldots, t_m^{(m)}(y) \rangle$ in $E$
and replacing the series of quantifiers by $\exists y$.

(d) The proof is analogous to that of (c). Here we use the fact that the
sets in $\Sigma_1$ are Diophantine, and we observe that, in general, $\exists \cdots \exists$ cannot
be replaced by $\exists$ in this case.
The proposition is proved. □


6.3. Theorem. For all $n \ge 1$,
$$\Sigma_n \setminus \Pi_n \ne \emptyset, \qquad \Pi_n \setminus \Sigma_n \ne \emptyset.$$


PROOF. The assertion that $\Sigma_1 \setminus \Pi_1 \ne \emptyset$ is precisely Theorem 5.8 of Chapter
V on the existence of undecidable enumerable sets. We prove the general
case by an analogous diagonal process applied to a versal family.

Let $\{E_k\}$ be a versal family of enumerable $(n + 1)$-sets over $\mathbf{Z}^+$, and let
$E$ be its total space:


$$\langle k, x_0, \ldots, x_n \rangle \in E \iff \langle x_0, \ldots, x_n \rangle \in E_k.$$
To fix ideas, suppose $n$ is even. We set
$$F = \{k \mid \exists x_1 \forall x_2 \cdots \forall x_n\, \neg(\langle k, k, x_1, \ldots, x_n \rangle \in E)\} \subset \mathbf{Z}^+.$$


By 6.2(c), we have $F \in \Sigma_n$. Since $\{E_k\}$ is versal, it follows by 6.2(c) that
any subset of $\mathbf{Z}^+$ in $\Pi_n$ can be represented in the form


$$F_{k_0} = \{x_0 \mid \neg(\exists x_1 \forall x_2 \cdots \forall x_n\, \neg(\langle k_0, x_0, x_1, \ldots, x_n \rangle \in E))\}$$


for some $k_0 \in \mathbf{Z}^+$. It is clear that $k_0$ lies either in $F \setminus F_{k_0}$ or in $F_{k_0} \setminus F$. Hence
$F \ne F_{k_0}$ and $F \in \Sigma_n \setminus \Pi_n$.
The other cases are handled analogously. □


6.4. Remarks 

(a) From the point of view of the theorems of Tarski and Gödel, the
results in 6.2 and 6.3 show us the tremendous distance from provability to
truth: $D \in \Sigma_1$, while $T$ falls not only outside $\Sigma_1$, but even outside $\Sigma_\infty$. In
the next section we indicate some mileposts along the way from $D$ to $T$.

(b) Although not really formally justified by the above considerations,
nevertheless it makes sense to classify arithmetic problems, i.e., questions
“Is it true that $P \in T$?,” according to the number of alternations between
$\exists$ and $\forall$ when the closed formula $P$ is written as in 6.2(c).

As we showed in §1 of Chapter I, the Fermat conjecture is expressed by 
a II,-formula, and the Riemann hypothesis is expressed by a II,-formula, 
although there is an assertion of type II, which is equivalent to the RH. 

H. Rogers writes that, 


“Almost all statements which (i) have been extensively studied by 
mathematicians and (ii) are known to be arithmetically expressible can be
seen, from a relatively superficial examination, to have quite low level in 
the &,, classification. As has been occasionally remarked, the human mind 
seems limited in its ability to understand and visualize beyond four or five 
alternations of quantifier. Indeed, it can be argued that the inventions, 
subtheories, and central lemmas of various parts of mathematics are 
devices for assisting the mind in dealing with one or two additional 
alternations of quantifier.”


7 Productivity of arithmetical truth


7.1. In this section we discuss a final feature of Gödel’s theorem: the
possibility, starting from any enumerable set of truths of arithmetic which 
we already know, effectively to enlarge this set by adding new truths. To 
see this more clearly, we examine the original version of the proof, in 
which the diagonal method is explicit, rather than hidden in the construc- 
tion of an undecidable enumerable set. It is convenient to describe this 
version by comparing it with the proof of Tarski’s theorem. 


7.2. Suppose we are given a language of arithmetic ($L_1$Ar, SAr, or an
extension of one of them). Further suppose that we have chosen a fixed 
numbering of its alphabet, which determines a fixed numbering N of the 
formulas. (It is essential to note that the construction which follows is not 
invariant if we replace our numbering by an equivalent one.) 

Both the Tarski and the Gödel arguments are based on the following
“self-reference lemma”:


7.3. Lemma. Given any formula $P(x)$ in the language which has one free
variable, we can effectively construct a closed formula $Q_P$ which says, “my
number does not belong to the set defined by $P$.” In other words, $Q_P$ is true
if and only if $P(\overline{N(Q_P)})$ is false, where $\overline{N(Q_P)}$ is the term-name for
$N(Q_P)$.


PROOF. This lemma was proved for SAr in §11 of Chapter II. In $L_1$Ar we
construct the formula $Q_P$ as follows.

If $R(x)$ is a formula with one free variable, we call the formula
$R(\overline{N(R(x))})$ its diagonalization. Let $\mathrm{diag} : \mathbf{Z}^+ \to \mathbf{Z}^+$ be the partial function

the N-number of a formula with one free variable
$\mapsto$ the N-number of its diagonalization.


It is easy to show, using the results and methods in §4, that diag is
computable. Thus, its graph is definable by a formula in $L_1$Ar which can
be explicitly constructed. We denote this formula by “$y = \mathrm{diag}\, x$,” construct
the formula $R(x) : \exists y(\text{“}y = \mathrm{diag}\, x\text{”} \wedge P(y))$, and finally set:


$$Q_P : \neg R(\overline{N(\neg R(x))}) = \text{the diagonalization of } \neg R(x).$$
By the definitions, we then have:


$Q_P$ is true $\iff$ the number of $\neg R(x)$ does not satisfy $R(x)$
$\iff$ the number of the diagonalization of $\neg R(x)$ does not satisfy $P(x)$
$\iff$ the number of $Q_P$ does not satisfy $P(x)$.


The lemma is proved. □

We note that it requires a large amount of technical work to verify that
“$y = \mathrm{diag}\, x$” is definable in $L_1$Ar, which is why we used SAr instead in
Chapter II.
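
The diagonal construction can be imitated on Python source strings, where quoting plays the role of the term-name of a number and no arithmetization is needed. The following is a minimal sketch and not the construction in $L_1$Ar: the placeholder `VAR` stands for the free variable, `diag` substitutes the quoted template for `VAR`, and the names `self_reference` and `provable` are hypothetical.

```python
def diag(template: str) -> str:
    """Diagonalization: plug the quoted template into its own free variable VAR."""
    return template.replace("VAR", repr(template))


def self_reference(pred_name: str) -> str:
    """Build a closed expression Q that evaluates to `not pred_name(Q)`."""
    r = "not " + pred_name + "(diag(VAR))"  # plays the role of the formula ¬R(x)
    return diag(r)


q = self_reference("P")
provable = set()  # a stand-in for the set defined by P
value = eval(q, {"diag": diag, "P": lambda s: s in provable})
print(q)
print(value)
```

Evaluating the inner call `diag(...)` inside `q` reproduces the string `q` itself, so `q` asserts that `P` fails on `q`; this is the string analogue of "$Q_P$ is true if and only if the number of $Q_P$ does not satisfy $P$."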


7.4. The arguments of Tarski and Gödel now take the following parallel
form:

Tarski:


(a) Suppose that truth is definable by a formula $P$.

(b) Then there is a formula $Q_P$ which says “I am not true.”
(c) The formula $Q_P$ cannot be false (because of its semantics).
(d) The formula $Q_P$ cannot be true (because of its semantics).
(e) Therefore, truth is not definable.


Gödel:


(a) Provability is definable by a formula $P$.

(b) There is a formula $Q_P$ which says “I am not provable.”

(c) The formula $Q_P$ cannot be false (because of its semantics, since
otherwise it would be provable, and hence true).

(d) Therefore, $Q_P$ is true.

(e) Therefore, $Q_P$ is not provable (because of its semantics).


We note that, in the above paraphrasing of Gödel’s argument, part (c)
explicitly uses the stipulation that only true formulas are provable. When
Gödel’s paper appeared in 1931, specialists were very busy looking for
finitistic proofs that the axioms of arithmetic are consistent, so that
stipulating that $D \subset T$ would have run counter to the spirit of the times.
Therefore, in Gödel’s own original wording the argument looks somewhat
different. This distinction is traditionally explained in great detail in all
textbooks on logic. However, we shall be satisfied with remarking that, if
$D \not\subset T$, then $D \ne T$, and the Incompleteness theorem is trivially true. But
in that case we would be in such bad shape that we would no longer care
about completeness or incompleteness.

The main point we are interested in is the following: given any fixed 
conception of provability which leads to an enumerable (or even to an 
arithmetical) set D of provable true formulas, we can effectively construct 
a new formula which is true but not provable. We now define more 
precisely what we mean by “effectively.” 


7.5. Definition. A set $F \subset \mathbf{Z}^+$ is said to be productive relative to a versal
family $\{E_k\}$ of 1-sets, if there exists a partial recursive function $f$ such
that: for all $k \in \mathbf{Z}^+$ with $E_k \subset F$, we have $k \in D(f)$ and $f(k) \in F \setminus E_k$.


7.6. Proposition. Under the conditions in 7.2, the set of numbers of true
formulas is productive relative to the versal family $\{E_k\}$ constructed in §8
of Chapter VI.

PROOF. To fix ideas, we shall work with the language $L_1$Ar. We first
construct an enumerable family $\{P_k(x_1)\}$ of formulas with one free variable
$x_1$ such that $P_k$ defines $E_k$. To do this, we define a sequence of terms
$f[k]$ in $L_1$Ar as in 8.1(a) of §8, Chapter VI, by setting


ffakl=k=+(---+(1,1)--+), — & times; 
fl4k + 1] =x, 4, =the (kK + 1)st variable in LAr; 
flak +2)= + (f[ (4), f[ 2(4)])s 
flak +3) =-(f[n())},F[2())- 


We then write 


P. = 3x,(3x5- + (ax(f[4(0] =f[1(k)]))- >: ) 


It is easy to see, using the methods in §4, that the function $k \mapsto N(P_k)$ is
recursive. We next fix a translation of “$y = \mathrm{diag}\, x$” and set


Ry = xq 1 (Ox 41 = diag x1") A (Px (1), 
Op, = A(R. (N (7(R,)))). 
and finally
$$f(k) = N(Q_{P_k}).$$


This function is computable because $N(P_k)$ is computable. By Lemma 7.3,
it satisfies the condition 7.5 with $T$ in place of $F$. □


7.7. The concept of productivity gives us the following approach to the
problem of exhausting $T$: we begin with the set $D_0$ of formulas which are
provable in the Peano axiom system $\mathrm{Ax}_0$; we define $D_0$ by a formula $P_0$;
we set $\mathrm{Ax}_1 = \mathrm{Ax}_0 \cup \{Q_{P_0}\}$; and we similarly construct $D_1$, $P_1$, and
$\mathrm{Ax}_2 = \mathrm{Ax}_1 \cup \{Q_{P_1}\}$, and so on. It follows from Gödel’s theorem that, as long as
we do all this “uniformly effectively,” we cannot obtain all of $T$ even after
transfinitely many steps. However, S. Feferman has shown that, if we are
willing to dispense with effectiveness, we can obtain all of TAr in this way.
We conclude this section by formulating Feferman’s result, which gives 
unexpected and philosophically interesting information about TAr. We 
omit the proof and the technical details (see Feferman’s original article 
“Transfinite recursive progressions of axiomatic theories,” J. Symb. Logic 
27, No. 3 (1962), 259-316). 


7.8. Principles of extension. In the first place, in order to exhaust TAr it is
not enough to add Gödel’s formula to $\mathrm{Ax}_i$ at every step. There are many
other ways of constructing intuitively true formulas which in various ways
formalize “having faith in the axioms $\mathrm{Ax}_i$.”

Feferman, in particular, uses the following construction. Suppose that
we have already constructed the axiom system $\mathrm{Ax}_\alpha$ (where $\alpha$ is an ordinal),
and that the set of numbers of formulas deducible from $\mathrm{Ax}_\alpha$ is defined by
the formula $D_\alpha$. For any formula $P(x)$ with one free variable, we construct
a formula $B_\alpha^P$ which has the intuitive meaning: “if $P(\bar{n})$ is provable (from
$\mathrm{Ax}_\alpha$) for all term-names $\bar{n}$ of natural numbers, then $\forall x\, P(x)$ is true.”
These formulas $B_\alpha^P$ must lie in $T$, and we can set


$$\mathrm{Ax}_{\alpha+1} = \mathrm{Ax}_\alpha \cup \{B_\alpha^P \mid \text{all } P\};$$


$$\mathrm{Ax}_\beta = \bigcup_{\alpha < \beta} \mathrm{Ax}_\alpha, \quad \text{if } \beta \text{ is a limit ordinal.}$$
Here is a method for giving $B_\alpha^P$ explicitly. The function $n \mapsto N(P(\bar{n}))$ is
computable as a function of $n$ and $N(P)$. We define its graph by a formula
$M(x, y, z)$, so that, for $l, m, n \in \mathbf{Z}^+$,


$M(\bar{l}, \bar{m}, \bar{n})$ is true $\iff$ $l$ is the number of a formula $P$ with one free
variable $x$, and $m$ is the number of $P(\bar{n})$.


We then set:
$$B_\alpha^P = \forall y\, \forall z\, (M(\overline{N(P)}, y, z) \to D_\alpha(y)) \to \forall x\, P(x).$$


7.9. The problem of choosing $D_\alpha$. This is the subtlest part of the proof.
Here it is crucial to show that $D_\beta$ exists when $\beta$ is a limit ordinal.

Feferman shows how the D, can be constructed for a suitable countable 
sequence of ordinals with limit y not exceeding w,“*” so that the following 
result will be true. 


7.10. Theorem. All true formulas in $L_1$Ar are deducible from $\bigcup_{\alpha < \gamma} \mathrm{Ax}_\alpha$.


Thus, suppose we have accepted the Peano axioms. Then, in order to 
attain the total truth in arithmetic, we must perform a transfinite sequence 
of acts of faith in our not having been led astray by the previous acts of 
faith. 


8 On the length of proofs 


8.1. The title of this section is taken from a short paper written by Gödel in
1936. His article consists of a precise formulation and proof of the
following qualitative assertions.

Suppose we are given a formal language $L$ together with some conception
of deducibility of a formula $P$ from a (variable) set of formulas $\mathcal{E}$.
Suppose, in addition, that we are actually given a function which estimates
the “complexity of deduction” of a formula $P$ from the set $\mathcal{E}$. (In
languages of $\mathcal{L}_1$, this “complexity” could be the minimal size of a deduction
of $P$ from $\mathcal{E}$, i.e., the number of signs of a fixed finite protoalphabet
needed for such a deduction; note that the use of the word “complexity”
here has nothing to do with the Kolmogorov complexity in §9 of Chapter
VI.) We further assume that $L$ contains a certain fragment of the logic of
$\mathcal{L}_1$, that $L$ and $\mathcal{E}$ are rich enough for the incompleteness principles to take
effect, and that the “complexity of deduction” satisfies certain natural
axioms. We then have the following facts:

(a) There exist formulas deducible from $\mathcal{E}$ whose deduction is arbitrarily
more complex than the formula itself.

Observation shows that this somewhat vaguely defined class includes, if
not the most important, at least the most “prized” mathematical facts.

(b) If we add any independent formula $A$ to the axioms $\mathcal{E}$, then we can
find formulas deducible from $\mathcal{E}$ whose deduction from $\mathcal{E} \cup \{A\}$ is arbitrarily
less complex than from $\mathcal{E}$ (the principle of cutting down proofs).

Compare with the great strength of “analytic” methods in comparison
with “elementary” methods in number theory.

The following more precise presentation of these ideas is based on a
short article by Ehrenfeucht and Mycielski in Bull. Amer. Math. Soc. 77,
No. 3 (1971), 366-367.


8.2. We consider the following set of data. 


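(a) A countable alphabet $A$ with a fixed numbering $N : A \to \mathbf{Z}^+$.

(b) A subset $F \subset S(A)$ whose elements are called formulas.

(c) A partial function $\mathcal{D} : \mathcal{P}(F) \to \mathcal{P}(F)$ which to certain subsets $\mathcal{E} \subset F$
associates sets $\mathcal{D}(\mathcal{E})$ of formulas “deducible from $\mathcal{E}$.” We shall often
write $\mathcal{E} \vdash P$ instead of $P \in \mathcal{D}(\mathcal{E})$.

(d) The complexity of deduction: this is a function $\mathrm{Cd}_{\mathcal{E}}(P)$ which is defined
for pairs $\mathcal{E} \subset F$, $P \in \mathcal{D}(\mathcal{E})$, and takes values in $\mathbf{Z}^+$. It is convenient to
take $\mathrm{Cd}_{\mathcal{E}}(P) = \infty$ if $P \notin \mathcal{D}(\mathcal{E})$.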


We impose the following conditions on this data: 


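8.3. (a) $A$ contains $\neg$, $\to$, (, and ).

(b) If $P$ and $Q \in F$, then $\neg(P)$ and $(P)\to(Q) \in F$. As usual, we shall
write $P \to Q$ instead of $(P)\to(Q)$, and so on.

($c_1$) $\mathcal{E} \subset \mathcal{D}(\mathcal{E})$; if $\mathcal{E} \subset \mathcal{E}'$ and $\mathcal{D}$ is defined at $\mathcal{E}$, then $\mathcal{D}$ is defined at
$\mathcal{E}'$ and $\mathcal{D}(\mathcal{E}) \subset \mathcal{D}(\mathcal{E}')$.

($c_2$) If $\mathcal{E} \cup \{P\} \vdash Q$, then $\mathcal{E} \vdash P \to Q$.

($c_3$) $\mathcal{E} \vdash P \to (\neg P \to Q)$ for any $P, Q \in F$.

($d_0$) If $\mathcal{E} \subset \mathcal{E}'$, then $\mathrm{Cd}_{\mathcal{E}'}(P) \le \mathrm{Cd}_{\mathcal{E}}(P)$.

($d_1$) The set $\{\langle P, n \rangle \mid \mathrm{Cd}_{\mathcal{E}}(P) \le n\} \subset S(A) \times \mathbf{Z}^+$ is decidable.

Condition ($d_1$) does not actually have to hold for all $\mathcal{E} \subset F$, but we shall
only consider those $\mathcal{E}$ for which it is true. In the case when $\mathrm{Cd}_{\mathcal{E}}(P)$ is the
size of the shortest $\mathcal{L}_1$-deduction of $P$ from $\mathcal{E}$ in a finite protoalphabet, and
$\mathcal{E}$ is a decidable set of axioms, ($d_1$) holds for the following reason. We can
write down all the texts in $A$ having size $\le n$ (there are a finite number of
them) and then verify for each one in turn whether or not it is a deduction
of $P$ from $\mathcal{E}$.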
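
The exhaustive search just described can be made concrete in a toy system. The sketch below is an illustration under assumptions that are not in the text: formulas are strings over the four-symbol protoalphabet `A`, `B`, `>`, `;`, implication is written `P>Q` without parentheses, a deduction is a `;`-separated list of formulas, modus ponens is the only rule, and the function names are hypothetical.

```python
from itertools import product


def is_deduction(text, axioms, goal):
    """Check that text is a ';'-separated deduction of goal from axioms.

    Every line must be an axiom or follow from two earlier lines
    p and p>line by modus ponens; the last line must be the goal.
    """
    lines = text.split(";")
    for j, line in enumerate(lines):
        earlier = lines[:j]
        by_mp = any(p + ">" + line in earlier for p in earlier)
        if line not in axioms and not by_mp:
            return False
    return lines[-1] == goal


def complexity_at_most(goal, n, axioms, alphabet="AB>;"):
    """Decide whether goal has a deduction of size <= n by brute force."""
    for size in range(1, n + 1):
        for symbols in product(alphabet, repeat=size):
            if is_deduction("".join(symbols), axioms, goal):
                return True
    return False


axioms = {"A", "A>B"}
print(complexity_at_most("B", 7, axioms))  # the text "A;A>B;B" has size 7
print(complexity_at_most("B", 6, axioms))
```

The search is exponential in $n$, which does not matter for ($d_1$): decidability only requires that the search terminate for each pair $\langle P, n \rangle$.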

($d_2$) There exists a general recursive function $f(x, y, z)$ which is
nondecreasing in $x$, such that


$$\mathrm{Cd}_{\mathcal{E} \cup \{P\}}(Q) \le f(\mathrm{Cd}_{\mathcal{E}}(P \to Q), N(P), N(Q))$$
for all $Q \in \mathcal{D}(\mathcal{E})$.


Both sides of this inequality are finite because of the previous conditions:
since $\mathcal{E} \vdash Q$, it follows by ($c_1$) that $\mathcal{E} \cup \{P\} \vdash Q$, and then by ($c_2$) that
$\mathcal{E} \vdash P \to Q$. We have an estimate of the type in ($d_2$) in languages of $\mathcal{L}_1$,
because, starting with any deduction of $P \to Q$ from $\mathcal{E}$, we can obtain a
deduction of $Q$ from $\mathcal{E} \cup \{P\}$ by simply adding $P$ and $Q$ (by modus
ponens). This increases the size of the deduction of $P \to Q$ by the sizes of $P$
and $Q$.

($d_3$) There exists a general recursive function $g(x, y)$ such that


$$\mathrm{Cd}_{\mathcal{E}}(P \to (\neg P \to Q)) \le g(N(P), N(Q)).$$


In languages of $\mathcal{L}_1$, the formula $P \to (\neg P \to Q)$ is a logical axiom, and,
if $\mathcal{E}$ contains this axiom, then the deduction has length 1 and size equal to
the size of the formula itself. Of course, the size of this formula can be
represented in the form $g(N(P), N(Q))$.

We now formulate Gödel’s theorem on “cutting down proofs.” We
suppose that the conditions and conventions in 8.2-8.3 are fulfilled.


8.4. Theorem.
(a) Suppose that $\mathcal{E} \subset F$ and $\mathcal{D}(\mathcal{E})$ is undecidable. Then for any general
recursive function $l$ there exist infinitely many formulas $P \in \mathcal{D}(\mathcal{E})$ such
that


$$\mathrm{Cd}_{\mathcal{E}}(P) > l(N(P)).$$


(b) Suppose that $\mathcal{E}' = \mathcal{E} \cup \{A\}$ and the formula $A$ has the property
that $\mathcal{D}(\mathcal{E} \cup \{\neg A\})$ is undecidable. Then for any general recursive function
$r$ there exist infinitely many formulas $P \in \mathcal{D}(\mathcal{E})$ such that


$$\mathrm{Cd}_{\mathcal{E}}(P) > r(\mathrm{Cd}_{\mathcal{E}'}(P)).$$


PROOF.
(a) If the first assertion were false, then for a suitable $l$ and for all
$P \in \mathcal{D}(\mathcal{E})$ we would have $\mathrm{Cd}_{\mathcal{E}}(P) \le l(N(P))$. But then the set


$$\mathcal{D}(\mathcal{E}) = \{P \mid \mathrm{Cd}_{\mathcal{E}}(P) \le l(N(P))\} \subset S(A)$$


would be decidable by ($d_1$), since it is obtained by applying a bounded
universal quantifier (in $n$) to the decidable set in ($d_1$). This contradicts the
assumption.

(b) Let $P \in \mathcal{D}(\mathcal{E} \cup \{\neg A\})$. By ($d_2$) we have
$$\mathrm{Cd}_{\mathcal{E} \cup \{\neg A\}}(P) \le f(\mathrm{Cd}_{\mathcal{E}}(\neg A \to P), N(\neg A), N(P)).$$


If we now suppose that the second assertion of the theorem were false,
then for a suitable nondecreasing general recursive function $r$ we would
obtain:


$$\mathrm{Cd}_{\mathcal{E}}(\neg A \to P) \le r(\mathrm{Cd}_{\mathcal{E}'}(\neg A \to P)),$$
or, by ($d_2$) and ($d_3$):
$$\begin{aligned}
\mathrm{Cd}_{\mathcal{E}}(\neg A \to P) & \le r \circ f(\mathrm{Cd}_{\mathcal{E}}(A \to (\neg A \to P)), N(A), N(P)) \\
& \le r \circ f(g(N(A), N(P)), N(A), N(P)).
\end{aligned}$$


Substituting this in the above inequality for $\mathrm{Cd}_{\mathcal{E} \cup \{\neg A\}}(P)$, for fixed $A$ we
obtain an estimate of the form


$$\mathrm{Cd}_{\mathcal{E} \cup \{\neg A\}}(P) \le l(N(P)),$$


where $l$ is general recursive and $P \in \mathcal{D}(\mathcal{E} \cup \{\neg A\})$. But this contradicts
the assumption that $\mathcal{D}(\mathcal{E} \cup \{\neg A\})$ is undecidable by the first assertion of
the theorem. □


Recursive groups


1 Basic result and its corollaries 


1.1. We consider a countable “group alphabet”
$$A = \{a_1, a_1^{-1}, \ldots, a_n, a_n^{-1}, \ldots\}.$$


The expressions in the alphabet $A$, including the empty expression $\emptyset$, are
traditionally called words. The word $a_i \cdots a_i$ ($m \ge 1$ times) will be written
$a_i^m$; the word $a_i^{-1} \cdots a_i^{-1}$ ($m \ge 1$ times) will be written $a_i^{-m}$; and we
agree to take $a_i^0 = \emptyset$. We call a word $a_{i_1}^{m_1} \cdots a_{i_k}^{m_k}$ reduced if either it is
empty or there are no subwords of the form $a_i^{-1} a_i$ or $a_i a_i^{-1}$ when it is
written in expanded form.

The operation of “joining and reducing” (by “reducing” we mean
crossing out all subwords of the form $a_i a_i^{-1}$ or $a_i^{-1} a_i$) defines a group
structure with unit $\emptyset$ (which we sometimes denote by 1) on the set of
reduced words. This is a free group $F$ with a countable set of generators
$\{a_1, \ldots, a_n, \ldots\}$. We can also consider nonreduced words as elements in
$F$: we identify such a word with the word obtained by reducing it.

We have a canonical numbering on $A$: $N(a_i) = 2i$, $N(a_i^{-1}) = 2i - 1$.
All properties related to the computability of operations and the enumera- 
bility of subsets in A and S(A) will be considered relative to any number- 
ing of A equivalent to N and any numbering of S(A) compatible with N 
(see the definitions in §1 of Chapter VII). We shall continually be making 
use of the following facts. 


1.2. Lemma.
(a) The set $F$ of reduced words is decidable.
(b) The group operations in $F$ are computable.
(c) A subgroup $G \subset F$ is enumerable in $S(A)$ if and only if it has an
enumerable set of generators.

(d) A normal subgroup $H \subset G$ in an enumerable subgroup $G \subset F$ is
enumerable if and only if it is generated as a normal subgroup by an
enumerable set.

(e) A homomorphism $F \to F$ is recursive if and only if the induced map
$\{a_1, \ldots, a_n, \ldots\} \to F$ is recursive.


The proof is a good exercise in using the techniques of Chapter VII, and 
we leave it to the reader. It is convenient to begin by showing that the 
operation of reducing is computable; the rest goes through more or less 
automatically. 
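
The first step suggested above, the computability of reducing, can be sketched in a few lines. The encoding is an assumption made only for this illustration: the letter $a_i$ is stored as the integer +i and $a_i^{-1}$ as -i, and the function names are hypothetical.

```python
# Sketch only: a_i is encoded as +i and a_i^{-1} as -i (an assumed encoding).

def number(letter):
    """Canonical numbering N(a_i) = 2i, N(a_i^{-1}) = 2i - 1."""
    i = abs(letter)
    return 2 * i if letter > 0 else 2 * i - 1

def reduce_word(word):
    """Cross out subwords a_i a_i^{-1} and a_i^{-1} a_i until none remain."""
    stack = []
    for letter in word:
        if stack and stack[-1] == -letter:
            stack.pop()            # an adjacent inverse pair cancels
        else:
            stack.append(letter)
    return stack

def is_reduced(word):
    """Decide membership in the set F of reduced words."""
    return reduce_word(word) == list(word)

def multiply(u, v):
    """Group operation in F: join, then reduce."""
    return reduce_word(list(u) + list(v))

def inverse(u):
    """Inverse in F: reverse the word and invert each letter."""
    return [-letter for letter in reversed(u)]
```

A single left-to-right pass with a stack suffices, because every cancellation exposes at most one new adjacent pair, and that pair sits at the top of the stack.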


1.3. Definition. A group is called recursive if it is isomorphic to a quotient
group of the form G/H, where $G \subset F$ is an enumerable subgroup and
$H \subset G$ is an enumerable normal subgroup.


Here we could limit ourselves to subgroups $G \subset F$ which are generated
by an enumerable subset of the standard generators $\{a_1, \ldots, a_n, \ldots\}$.


1.4. REMARKS AND EXAMPLES. 


(a) Recursive groups have at most countably many elements. 

(b) Finitely presented (f.p.) groups, i.e., those which have a finite number of 
generators and relations, are recursive. In particular, finite groups and 
finitely generated (f.g.) abelian groups are recursive. 

(c) A subgroup H of an f.p. group G is not necessarily f.p. (or even f.g.). 
But, if it is finitely generated, then it is recursive. 


In fact, let $\{h_1, \ldots, h_m\}$ be generators of H. We add generators
$\{h_{m+1}, \ldots, h_n\}$ of the group G which are connected by a finite number of
relations, and we define a homomorphism $\varphi : F \to G$ by setting $\varphi(a_i) = h_i$ if
$i \le n$ and $\varphi(a_i) = 1$ if $i > n$. The kernel E of $\varphi$ is generated by a finite
number of relations between the $a_1, \ldots, a_n$ and by the set
$\{a_{n+1}, a_{n+2}, \ldots\}$. Hence E is enumerable by Lemma 1.2(d). The subgroup
$\tilde{H} \subset F$ generated by $a_1, \ldots, a_m$ is also enumerable, by Lemma 1.2(c).
Therefore the set $\tilde{H} \cap E$ is enumerable. But $\varphi$ induces an isomorphism
$\tilde{H} / \tilde{H} \cap E \to H$. Consequently, H is recursive. □


The basic aim of this chapter is to prove the following remarkable 
theorem of Higman, which gives the converse of the simple assertion 
1.4(c). (G. Higman, Subgroups of finitely presented groups, Proc. Royal 
Soc., Ser. A, vol. 262 (1961), 455-475.) 


1.5. Theorem. 


(a) Any recursive group G/H (in the notation of 1.3) can be embedded in 
a suitable f.p. group F/N. 

(b) This embedding can be made effective, i.e., it can be induced by a 
suitable recursive map G— F. 


Here are some applications of this theorem. 


1.6. Corollary (Universal finitely presented groups). There exists an f.p. 
group U such that any f.p. group G can be embedded in U (and hence, any 
recursive group can be embedded in U). 


In fact, any f.p. group is isomorphic to the quotient of F by a normal
subgroup which is generated by a finite set of reduced words in F and by
all $a_i$ with $i > n$ for some n. We let $I \subset S(S(A)) \times \mathbb{Z}^+$ be the decidable set
of pairs ⟨a finite sequence of reduced words, n⟩, and we let $N_i$ (for $i \in I$)
denote the corresponding normal subgroup. We construct the “doubly
infinite” group alphabet $\{a_{jk}, a_{jk}^{-1} \mid j, k \ge 1\}$, we identify I with $\mathbb{Z}^+$ by
choosing a recursive numbering of I, and we define the group $U_0$, which
has generators $\{a_{jk}\}$ and relations “$N_j$, written in the alphabet
$\{a_{j1}, a_{j2}, \ldots\}$.” It is clear that $U_0$ is recursive. It will also be clear from the
results in the next section that $U_0$ is the free product of all the groups
$F/N_j$, so that any f.p. group can be embedded in $U_0$. Thus, any f.p. group
U in which we can embed $U_0$, using Higman’s theorem, is universal. □


In M. K. Valijev’s article Examples of universal finitely presented 
groups, Dok. AN SSSR, 1973, vol. 211, No. 2, a universal group U is 
constructed which has 14 generators and 42 relations, and it is mentioned 
that such a group can be constructed with only 2 generators and 27 
relations. 


1.7. F.p. groups with algorithmically undecidable word problem. 
Let G be the group with four generators a, b, c, d, and with the relations

$$b^{-m} a b^{m} = d^{-m} c d^{m}, \quad \text{for all } m \in E,$$

where $E \subset \mathbb{Z}^+$ is an undecidable enumerable set. It easily follows from the
results in §2 that the equation

$$b^{-x} a b^{x} = d^{-x} c d^{x}$$

only holds in G if $x \in E$. (In fact, the elements $b^{-m} a b^{m}$ for $m \ge 1$ generate
a free subgroup of G, so that G contains the free product of the subgroups
generated by $\{b^{-x} a b^{x} \mid x \ge 1\}$ and by $\{d^{-x} c d^{x} \mid x \ge 1\}$ with amalgamation
$\{b^{-x} a b^{x} = d^{-x} c d^{x} \mid x \in E\}$.) Hence, the question of whether or not the
equation $b^{-x} a b^{x} = d^{-x} c d^{x}$ holds is undecidable (as a mass problem
indexed by x), and, if we embed G effectively in an f.p. group, we may
conclude that the word problem is unsolvable in this f.p. group.

The existence of such groups was first established by P. S. Novikov and 
W. Boone. 


1.8. “Natural” recursive groups. We find many examples of recursive 
groups which are not a priori finitely presented in algebraic geometry over 
algebraic number fields. We shall limit ourselves to one typical example. 

Let $O_n(\mathbb{Q})$ be the orthogonal group of automorphisms of an n-dimensional
linear space L (over the rational numbers $\mathbb{Q}$) together with a
quadratic form f. Let b be the corresponding bilinear form. The symmetry
$\tau_x \in O_n(\mathbb{Q})$ is defined for any vector $x \in L$ with $f(x) \ne 0$:

$$\tau_x(y) = y - \frac{b(x, y)}{f(x)}\, x$$

for all $y \in L$. The involutions $\tau_x \in O_n(\mathbb{Q})$ give us an enumerable system of
generators of $O_n(\mathbb{Q})$, and all the relations are generated by the enumerable
(indeed, decidable) system of relations

$$\tau_x^2 = 1, \quad (\tau_x \tau_y \tau_z)^2 = 1, \quad \text{for all coplanar } \{x, y, z\}$$


(S. Becken). 

The numbering of L = Q” implicit here is taken to be compatible with 
any numbering of Q which is compatible with the standard numbering of 
Z* and in which the field operations are computable. 
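
The symmetries and the two families of relations can be checked numerically over Q. The sketch below makes two assumptions that are not stated in the text: f is taken to be the sum of squares on $\mathbb{Q}^3$, and the bilinear form is normalized as b(x, y) = f(x + y) - f(x) - f(y), so that $\tau_x(x) = -x$. All function names are hypothetical.

```python
from fractions import Fraction

def f(x):
    """Quadratic form; the sum of squares is an illustrative choice."""
    return sum(c * c for c in x)

def b(x, y):
    """Assumed normalization: b(x, y) = f(x + y) - f(x) - f(y)."""
    return f([p + q for p, q in zip(x, y)]) - f(x) - f(y)

def symmetry(x):
    """Return tau_x as the map y -> y - (b(x, y) / f(x)) x, using exact rationals."""
    fx = Fraction(f(x))
    if fx == 0:
        raise ValueError("tau_x is only defined when f(x) != 0")
    def tau(y):
        coeff = Fraction(b(x, y)) / fx
        return [Fraction(q) - coeff * p for p, q in zip(x, y)]
    return tau

# Three coplanar vectors (z = x + y) and their symmetries.
x, y, z = [1, 2, 3], [0, 1, -1], [1, 3, 2]
tx, ty, tz = symmetry(x), symmetry(y), symmetry(z)

def triple(v):
    """Apply the product tau_x tau_y tau_z to v."""
    return tx(ty(tz(v)))

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
involution_ok = all(tx(tx(v)) == v for v in basis)
triple_ok = all(triple(triple(v)) == v for v in basis)
```

Exact rational arithmetic is used so that the relations are verified as identities and not up to rounding; this is also in the spirit of the computable field operations on Q mentioned above.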


1.9. Higman’s theorem is related to the theorem that enumerable sets are 
Diophantine (Chapter VI), although it was first proved earlier than the 
latter result. Perhaps both facts are special cases of some general assertion 
about recursive algebraic structures. 

In any case, the theorem on the Diophantine nature of enumerable sets 
can be used to simplify considerably the recursion-theoretic part of 
Higman’s proof. This was shown by Valijev, whose construction will be 
given in §§5-6 (cf. Algebra i Logika, vol. 7, No. 3 (1968)). §§2-4 will be 
devoted to the group theoretic preliminaries; here we shall follow Higman. 


2 Free products and HNN-extensions 


2.1 Suppose we are given a family of groups $(G_i)$, $i \in I$, and a family of
group homomorphisms $\alpha_i : A \to G_i$. We consider the class of families
$(H, \beta_i)$ of homomorphisms $\beta_i : G_i \to H$ such that $\beta_i \circ \alpha_i : A \to H$ does not
depend on $i \in I$. This class contains a universal family $\varphi_i : G_i \to *_A G_i$
which is unique up to isomorphism: any other family $(H, \beta_i)$ uniquely
determines and is uniquely determined by the homomorphism $\psi : *_A G_i \to H$
for which $\beta_i = \psi \circ \varphi_i$.
In what follows we shall only need the case when all the $\alpha_i$ are
embeddings. In this case $*_A G_i$ is called the free product of the groups $G_i$
with amalgamated subgroups $\alpha_i(A) \subset G_i$. We shall generally denote the
structure maps $G_i \to *_A G_i$ by $\varphi_i$, perhaps with additional indices. We let
$\varphi$ denote the structure homomorphism $\varphi_i \circ \alpha_i : A \to *_A G_i$, which does not
depend on i. If $A = \{1\}$, we write simply $* G_i$ instead of $*_A G_i$; if the set of
indices is $\{1, \ldots, n\}$, we write $G_1 *_A \cdots *_A G_n$, and so on. We shall
continually be making use of the following structure lemma.

Let $\alpha_i : A \to G_i$ be embeddings, and let $S_i \subset G_i$ be subsets such that

$$G_i \setminus \alpha_i(A) = \bigcup_{s \in S_i} \alpha_i(A)\, s \quad \text{and} \quad \alpha_i(A)\, s_1 \ne \alpha_i(A)\, s_2 \ \text{ for } s_1 \ne s_2 \in S_i.$$


2.2. Proposition. Any element in the group $*_A G_i$ can be uniquely represented
in the form

$$\varphi(a)\, \varphi_{i_1}(s_1) \cdots \varphi_{i_n}(s_n),$$

where $a \in A$, $s_j \in S_{i_j}$, $i_j \ne i_{j+1}$ for all j, and $n \ge 0$ depends on the element.

We shall call this the canonical expansion of an element.
For the proof of this fact and for further details, see, for example,
Serre’s lecture notes Arbres, amalgames et $SL_2$.


2.3. Corollaries 


(a) Under the conditions in 2.2, the structure homomorphisms $\varphi$ and $\varphi_i$ are
embeddings.


This allows us to identify A and $G_i$ with subgroups of $*_A G_i$ using $\varphi$ and
$\varphi_i$. We shall do this in the statements that follow. However, in the
several-step constructions in the later subsections, one and the same group
will be embedded in another group in many different ways using various
compositions of the structure maps, and it will be necessary to keep careful
track of these embeddings.


(b) $G_i \cap G_j = A$ (in $*_A G_i$) for $i \ne j$.


In other words, $\varphi_i(G_i) \cap \varphi_j(G_j) = \varphi(A)$. We can use Proposition 2.2 to
prove the inclusion $\subset$: otherwise we would have $\varphi_i(s_i) = \varphi_j(s_j)$, which would
contradict the uniqueness.


(c) Suppose we are given a family of embeddings $\beta_i : H_i \to G_i$ and a
subgroup $B \subset A$ such that $\beta_i(H_i) \cap \alpha_i(A) = \alpha_i(B)$ for all i. Then the
composition

$$B \xrightarrow{\ \beta_i^{-1} \circ \alpha_i\ } H_i \xrightarrow{\ \varphi_i \circ \beta_i\ } *_A G_i$$

does not depend on i, and therefore gives a canonical map $*_B H_i \to *_A G_i$.
This map is an embedding. In particular, the subgroup of $*_A G_i$
generated by the $\varphi_i \circ \beta_i(H_i)$ is isomorphic to $*_B H_i$.


In fact, the canonical expansion in 2.2 of an element in $*_B H_i$ goes to
the canonical expansion of the image of this element in $*_A G_i$.


(d) With the same notation, we have

$$(*_B H_i) \cap A = B \ \text{ in } *_A G_i; \qquad (*_B H_i) \cap G_i = H_i \ \text{ in } *_A G_i.$$


2.4. Generators and relations. Let M be a set, and let R be a subset of the
free group $F_M$, which is freely generated by M. We let $|M : R|$ denote the
quotient group $F_M / \bar{R}$, where $\bar{R}$ is the smallest normal subgroup of $F_M$
containing R. This is what we mean by defining a group by generators (M)
and relations (R).

We shall take the following liberties with notation: 

(a) If M has a nonempty intersection with a group that has already been 
defined, then all relations coming from the relations in the earlier group 
are assumed to be included in R, even if they are not explicitly written out. 
We might completely omit any reference to R if there are no other 
relations besides those coming from the earlier group. For example, if E 
and F C G are two subgroups, then |E U F| is the subgroup they generate 
in G, and so on. 

(b) Instead of writing, say, $a_1 a_2^{-1}$ is in R, we may write $a_1 = a_2$.


EXAMPLE. If the $\alpha_i : A \to G_i$ are embeddings, then $*_A G_i$ is defined by the
following generators and relations:

$$\Big|\, \bigcup_{i \in I} G_i \ :\ \alpha_i(a) = \alpha_j(a) \ \text{ for all } a \in A,\ i, j \in I \,\Big|.$$


We now introduce a construction which will be fundamental for every- 
thing that follows (G. Higman, B. Neumann, H. Neumann). 
Suppose we are given two embeddings of groups $\alpha, \beta : A \to G$.


2.5. Definition. The HNN-extension of the group G (relative to A, $\alpha$, $\beta$) is
the group

$$K = |G \cup \{t\} : t^{-1} \alpha(a)\, t = \beta(a) \ \text{ for all } a \in A|.$$
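
As a small illustration of the definition (not an example taken from the text), take G = A = Z with generator a, let $\alpha$ be the identity and let $\beta$ send $a^k$ to $a^{2k}$. The HNN-extension is then $|a, t : t^{-1} a t = a^2|$. The sketch below only checks that the defining relations hold for one assumed pair of rational 2×2 matrices, i.e. in a homomorphic image of K; the matrices and helper names are choices made for this illustration.

```python
from fractions import Fraction

IDENTITY = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

def matmul(p, q):
    """Product of two 2x2 matrices with rational entries."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power(m, n):
    """m raised to a nonnegative integer power n."""
    result = IDENTITY
    for _ in range(n):
        result = matmul(result, m)
    return result

# Assumed images of the generator a of G = Z and of the stable letter t.
A = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(1)]]
T = [[Fraction(1, 2), Fraction(0)], [Fraction(0), Fraction(1)]]
T_INV = [[Fraction(2), Fraction(0)], [Fraction(0), Fraction(1)]]

def conjugate_by_t(m):
    """Compute t^{-1} m t."""
    return matmul(matmul(T_INV, m), T)

# t^{-1} alpha(a^k) t = beta(a^k), i.e. t^{-1} a^k t = a^{2k}, for a few k.
relations_hold = all(conjugate_by_t(power(A, k)) == power(A, 2 * k)
                     for k in range(1, 6))
```

In this image the element a has infinite order, which agrees with G = Z being carried injectively; that the map on all of K has this property is the content of the proposition that follows, not of the computation.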


2.6. Proposition. The following homomorphisms are embeddings:


(a) $G \to K$ : $g \mapsto$ the class of g modulo the relations in K.
(b) $G *_A t^{-1} G t \to K$, where the free product is taken relative to the
embeddings $a \mapsto \beta(a)$ and $a \mapsto t^{-1} \alpha(a)\, t$.
PROOF. In the group $G * \{u^n\}$, the subgroup U generated by G and
$u^{-1} \alpha(A) u$ is isomorphic to $G * u^{-1} \alpha(A) u$. In fact, the canonical expansion
of an element in $G * u^{-1} \alpha(A) u$ has the form
$g_1 u^{-1} \alpha(a_1) u\, g_2 \cdots g_n u^{-1} \alpha(a_n) u$, where $g_1 \in G$,
$g_2, \ldots, g_n \in G \setminus \{1\}$, $a_1, \ldots, a_{n-1} \in A \setminus \{1\}$, $a_n \in A$,
and so this expansion also has the canonical form in $G * \{u^n\}$.

We construct the subgroup $V = G * v \beta(A) v^{-1} \subset G * \{v^n\}$ similarly.

We identify the group $W = G * w^{-1} A w$ with U and V by means of the
isomorphisms which are the identity on G and take $w^{-1} a w$ to $u^{-1} \alpha(a) u$
and $v \beta(a) v^{-1}$, respectively.

We now consider the group $(G * \{u^n\}) *_W (G * \{v^n\})$. The group
$G \subset W$ is canonically embedded in it, and for all $a \in A$ the element $t = uv$
satisfies the relation

$$t^{-1} \alpha(a)\, t = \beta(a),$$

because we have made the identification $u^{-1} \alpha(a) u = v \beta(a) v^{-1}$. In addition,
it is clear from Proposition 2.2 that in $(G * \{u^n\}) *_W (G * \{v^n\})$ the
groups $u^{-1} G u$ and $v G v^{-1}$ generate a free product with amalgamation A
embedded by means of the maps $a \mapsto u^{-1} \alpha(a) u$ and $a \mapsto v \beta(a) v^{-1}$,
respectively. Hence, if we conjugate by v, we see that G and $t^{-1} G t$ also
generate a free product, as described in the statement of 2.6.

Therefore, the subgroup

$$K' = |G \cup \{t = uv\}| \subset (G * \{u^n\}) *_W (G * \{v^n\})$$

is a homomorphic image of K, and assertions (a) and (b) hold for $K'$.
Moreover, the canonical map $K \to K'$ is an isomorphism. To see this it
suffices to note that there exists an isomorphism

$$K * \{v^n\} \to (G * \{u^n\}) *_W (G * \{v^n\})$$

which takes $t \in K$ to $uv$. In particular, t has infinite order in K. The
proposition is proved. □


We shall need to refine and generalize this result in two directions. In 
the first place, we want to consider iterated HNN-extensions; in the 
second place, we are interested in the connection between HNN-exten- 
sions of a group and a subgroup. We now bring together all the facts we 
need into a single statement. 

Suppose that we are given an entire family of pairs of embeddings
$\alpha_i, \beta_i : A_i \to G$ ($i \in I$) and a subgroup $H \subset G$ with the property that
$\alpha_i^{-1}(\alpha_i(A_i) \cap H) = \beta_i^{-1}(\beta_i(A_i) \cap H) = B_i \subset A_i$ are subgroups. Under
these conditions we have


2.7 Proposition. Let

$$K_G = |G \cup \{t_i \mid i \in I\} : t_i^{-1} \alpha_i(a)\, t_i = \beta_i(a) \ \text{ for all } i \in I,\ a \in A_i|;$$
$$K_H = |H \cup \{t_i' \mid i \in I\} : t_i'^{-1} \alpha_i(b)\, t_i' = \beta_i(b) \ \text{ for all } i \in I,\ b \in B_i|.$$

Then


(a) the $\{t_i\}$ freely generate a free subgroup in $K_G$;
(b) the natural maps $G \to K_G$ and $K_H \to K_G$ (the latter given by $t_i' \mapsto t_i$)
are embeddings. In addition, $K_H \cap G = H$ in $K_G$.


PROOF.

(a) If the relations in $K_G$ implied a nontrivial relation between the $t_i$, this
relation would be preserved in the quotient of $K_G$ by the smallest normal
divisor containing G. But in this quotient the relations $t_i^{-1} \alpha_i(a)\, t_i = \beta_i(a)$
become trivial (1 = 1), and no restrictions are imposed on the images of the
$t_i$. This proves (a).

(b) We first consider the case when I consists of one element. In the
notation used in the proof of Proposition 2.6, we consider $K_G$ as a
subgroup of $(G * \{u^n\}) *_W (G * \{v^n\})$. By Proposition 2.2, in $G * \{u^n\}$
we have

$$H * \{u^n\} \cap G * u^{-1} \alpha(A) u = H * u^{-1} \alpha(B) u,$$

and, similarly, in $G * \{v^n\}$ we have

$$H * \{v^n\} \cap G * v \beta(A) v^{-1} = H * v \beta(B) v^{-1}.$$

The above identifications of U and V with W identify these intersections
with the subgroup

$$W_H = H * w^{-1} B w \subset G * w^{-1} A w = W.$$

By Corollary 2.3(c), we have a canonical embedding

$$(H * \{u^n\}) *_{W_H} (H * \{v^n\}) \to (G * \{u^n\}) *_W (G * \{v^n\}).$$

But, as at the end of the proof of 2.6, the group on the left is $K_H * \{v^n\}$
and the group on the right is $K_G * \{v^n\}$, so we obtain an embedding
$K_H \to K_G$.

Furthermore (the intersection is taken in $(G * \{u^n\}) *_W (G * \{v^n\})$):

$$(H * \{u^n\}) *_{W_H} (H * \{v^n\}) \cap G * u^{-1} \alpha(A) u = H * u^{-1} \alpha(B) u,$$

so that, if we now intersect with G, we obtain H. It follows a fortiori that
$K_H \cap G = H$.

We prove (b) for finite I by an easy induction on the number of elements
of I, and then for infinite I by passing to the inductive limit (which here
is a union). We leave the details to the reader. □


3 Embeddings in groups with two generators 


In this section we prove a result which will be used later and which shows 
vividly in a simple situation how the number of generators can be de- 
creased using embeddings. 
3.1. Proposition.


(a) Any countable or finite group G can be embedded in a group with two
generators.
(b) If G is recursive, then there is such an embedding which is recursive.


PROOF.
(a) The group $\mathbb{Z} * \mathbb{Z} = \{b^n\} * \{v^n\}$ has a free subgroup of countable
rank, for example,

$$S = |\{b^{-i} v b^{i} \mid i > 0\}|.$$

It immediately follows from Proposition 2.2 that there are no relations
between the generators $b^{-i} v b^{i}$.

Thus, if G is a free countable group, it embeds in $\mathbb{Z} * \mathbb{Z}$. If G is not, we
could try to represent G in the form F/N, where F is countable and free,
then embed F in $\mathbb{Z} * \mathbb{Z}$ and consider the induced homomorphism
$F/N \to \mathbb{Z} * \mathbb{Z} / N'$, where $N'$ is the normal subgroup in $\mathbb{Z} * \mathbb{Z}$ generated by N.
Unfortunately, $N' \cap F$ may be strictly larger than N, so that this
homomorphism does not have to be an embedding. The following construction
shows how to deal with this problem.


Let $\{g_1, g_2, g_3, \ldots\}$ be a countable system of generators of G, where
$g_i \ne 1$. We successively construct the following extensions of G:
(1) $G * \{u^n\}$;


(2) the HNN-extension of $G * \{u^n\}$

$$|G * \{u^n\} \cup \{t_i\} : t_i^{-1} u\, t_i = u g_i,\ i = 1, 2, \ldots|$$

(note that u and the $u g_i$ generate infinite cyclic subgroups in $G * \{u^n\}$);

(3) the free product P of this HNN-extension and the group
$\{b^n\} * \{v^n\}$ with subgroups $|\{t_1, t_2, \ldots\}|$ and $|\{b^{-i} v b^{i} \mid i \ge 1\}|$
amalgamated by means of the isomorphism

$$t_i = b^{-i} v b^{i}, \quad i \ge 1.$$

(4) P has the two rank 2 free subgroups $|\{b, v\}|$ and $|\{u, b\}|$. There are
no relations between u and b because there can be no relations in the
quotient by the smallest normal subgroup containing G, the $t_i$, and v.

Finally, we construct the following HNN-extension of P:

$$Q = |P \cup \{a\} : a^{-1} b a = u,\ a^{-1} v a = b|.$$

To complete the proof, it remains to verify that Q is generated by the
elements a and b.

In fact, Q has the obvious system of generators $\{g_i, t_i\ (i \ge 1);\ u, v, a, b\}$.
The relations $g_i = u^{-1} t_i^{-1} u\, t_i$ allow us to eliminate the $g_i$; the relations
$t_i = b^{-i} v b^{i}$ allow us to eliminate the $t_i$; and the relations $u = a^{-1} b a$ and
$v = a b a^{-1}$ allow us to eliminate u and v. This proves the first part of the
proposition. The following analysis of the construction establishes part (b).
If we express $g_i$ in terms of a and b in Q using the above relations, we
find that $g_i = e_i$ modulo the relations in Q, where

$$e_i = a^{-1} b^{-1} a\, b^{-i} a b^{-1} a^{-1} b^{i}\, a^{-1} b a\, b^{-i} a b a^{-1} b^{i}.$$

Hence, the subgroup $E = |\{e_i \mid i \ge 1\}|$ in the group $\{a^n\} * \{b^n\}$ has the
following remarkable property: any normal subgroup $N \subset E$ generates a
normal subgroup $N'$ in $\{a^n\} * \{b^n\}$ such that $E \cap N' = N$ (compare with
the remark at the beginning of the proof).

In particular, if $\{g_i\}$ is an enumerable system of generators of G which
is connected by an enumerable set of relations, it follows that the map
$g_i \mapsto e_i$ (mod the relations) induces a recursive embedding of G in the
recursive group E/N’, since N’ is enumerable whenever N is. □
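
The word $e_i$ can be recomputed mechanically from the relations $g_i = u^{-1} t_i^{-1} u\, t_i$, $t_i = b^{-i} v b^{i}$, $u = a^{-1} b a$ and $v = a b a^{-1}$. The sketch below does this by substitution and free reduction; the representation of words as lists of (generator, ±1) pairs and all function names are assumptions made for the illustration.

```python
def reduce_word(word):
    """Freely reduce a word given as a list of (generator, +1 or -1) pairs."""
    stack = []
    for gen, sign in word:
        if stack and stack[-1] == (gen, -sign):
            stack.pop()                  # cancel an adjacent inverse pair
        else:
            stack.append((gen, sign))
    return stack

def inv(word):
    """Inverse word: reverse the order and flip every exponent."""
    return [(gen, -sign) for gen, sign in reversed(word)]

def power(word, n):
    """n-th power of a word, allowing negative n."""
    base = word if n >= 0 else inv(word)
    return base * abs(n)

a, b = [("a", 1)], [("b", 1)]
u = inv(a) + b + a                       # u = a^{-1} b a
v = a + b + inv(a)                       # v = a b a^{-1}

def t(i):
    """t_i = b^{-i} v b^{i}, written in the letters a and b."""
    return power(b, -i) + v + power(b, i)

def e(i):
    """e_i = u^{-1} t_i^{-1} u t_i, freely reduced."""
    return reduce_word(inv(u) + inv(t(i)) + u + t(i))

def show(word):
    """Render a word in a readable form such as 'a^-1 b a'."""
    return " ".join(g if s == 1 else g + "^-1" for g, s in word)
```

Calling `show(e(1))` prints the sixteen letters of $e_1$. No cancellation occurs between the four blocks, so the reduced word $e_i$ has length 4i + 12.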


4 Benign subgroups 


4.1. Definition-Lemma. Let G be a finitely presented group, and let $H \subset G$
be a subgroup. H is called benign if the following equivalent conditions are
fulfilled:


(a) There exists a finitely presented group K, a finitely generated subgroup
$L \subset K$, and an embedding $G \subset K$ such that $G \cap L = H$.
(b) The HNN-extension

$$K_G = |G \cup \{t\} : t^{-1} h t = h, \ \text{ for all } h \in H|$$

can be embedded in a finitely presented group.
(c) $G *_H G$ can be embedded in a finitely presented group.

PROOF OF THE EQUIVALENCE
(a)⇒(b). Suppose that $G \subset K$ and L satisfy (a). Then it follows by 2.6
that $K_G$ is embedded in the HNN-extension

$$|K \cup \{t\} : t^{-1} l t = l, \ \text{ for all } l \in L|.$$

This group is finitely presented: we add t to the generators of K, and add
the relations $t^{-1} l_i t = l_i$, for a finite system of generators $\{l_i\}$ of L, to the
relations between the generators of K.

(b)⇒(c). The group $G *_H G$ is embedded in $K_G$ by 2.6(b), and $K_G$ can be
embedded in an f.p. group because we have assumed condition (b).

(c)⇒(a). Suppose that $G *_H G \subset M$, where M is finitely presented. We
set K = M, we set L = the image of G under the composite embedding
$\varphi_2 : G \to G *_H G \to M$, and we embed G in K by means of $\varphi_1 : G \to G *_H G
\to M$. Since $\varphi_1(G) \cap \varphi_2(G) = H$, we have $G \cap L = H$ in K, as required. □


The basic goal of this section is to reduce Higman’s theorem 1.5 to
proving that all enumerable subgroups in $\mathbb{Z} * \mathbb{Z}$ are benign. For this
purpose and for later uses we shall need the following lemma.
4.2. Lemma. Let R be a benign subgroup of an f.g. free group F, and let $\bar{R}$ be
the normal subgroup it generates. Then $F/\bar{R}$ can be embedded in an f.p.
group.

PROOF. Let i be an embedding of $F *_R F$ in an f.p. group K (see 4.1(c)), and
let $\varphi_1, \varphi_2 : F \to F *_R F$ be the structure maps. We consider two embeddings
of F in $K \times F/\bar{R}$:

$$\alpha : f \mapsto \langle i \circ \varphi_1(f),\ f \bar{R} \rangle; \qquad \beta : f \mapsto \langle i \circ \varphi_2(f),\ 1 \rangle.$$

They obviously coincide on the subgroup $R \subset F$. Hence they are induced
by a homomorphism

$$\psi : F *_R F \to K \times F/\bar{R},$$

which has a trivial kernel, since the composition of $\psi$ with the projection
onto K coincides with i.

We construct an HNN-extension which takes $i \times \{1\} : F *_R F \to K \times F/\bar{R}$
to $\psi$:

$$L = |K \times F/\bar{R} \cup \{t\} : t^{-1} \langle i \circ \varphi_1(f), 1 \rangle t = \langle i \circ \varphi_1(f), f \bar{R} \rangle,$$
$$\qquad t^{-1} \langle i \circ \varphi_2(f), 1 \rangle t = \langle i \circ \varphi_2(f), 1 \rangle \ \text{ for all } f \in F|.$$


L obviously contains $F/\bar{R}$. We show that L is finitely presented.

Generators of L : {t} ∪ finite system of generators of K ∪ finite system
of generators of F. This system is finite.

Relations in L :

(a) {the relations between the generators of K}.

(b) {the commutation relations between the generators of K and the
generators of F}.

After imposing these relations, we may consider that we are working in
$K \times F$.


(c) $t^{-1} \langle i \circ \varphi_1(f), 1 \rangle t = \langle i \circ \varphi_1(f), f \rangle$,
$t^{-1} \langle i \circ \varphi_2(f), 1 \rangle t = \langle i \circ \varphi_2(f), 1 \rangle$,


where f runs through the system of generators of F.

(d) The relations in $\bar{R}$ between the generators of F.

We can take the system of relations $R_0$ = (a) ∪ (b) ∪ (c) to be finite. We
need only verify that the relations in (d) follow from $R_0$.

Let $R' \subset F$ be the normal subgroup generated by $R_0$, i.e., the kernel of
the natural homomorphism $F \to |K \cup F \cup \{t\} : R_0|$. We want to show that
$R' = \bar{R}$. The inclusion $R' \subset \bar{R}$ is obvious. We verify the converse.

If $f \in F$, we set $f' = f \bmod R'$ and $f_{1,2} = i \circ \varphi_{1,2}(f) \in K$. It then follows
from the relations (b) and (c) that in $K \times F/R'$ we have

$$t^{-1} \langle f_1, 1 \rangle t = \langle f_1, 1 \rangle \langle 1, f' \rangle \quad \text{and} \quad t^{-1} \langle f_2, 1 \rangle t = \langle f_2, 1 \rangle.$$

On the other hand, if $f \in R$, then, since $F *_R F$ is embedded in K, it
follows from the relations (a) that $f_1 = f_2$. Hence $f' = 1$, so that $R \subset R'$,
and therefore $\bar{R} \subset R'$ because $R'$ is normal. □


This lemma gives us the following reduction step. 


4.3. Proposition. If all enumerable subgroups in $\mathbb{Z} * \mathbb{Z}$ are benign, then
Higman’s theorem is true.


PROOF. Let G be the free group generated by an enumerable set of free
generators $\{g_i\}$, i = 1, 2, 3, ..., and let $N \subset G$ be an enumerable normal
subgroup. We shall show how to embed G/N into an f.p. group.

We first consider the embedding $G \to \{a^n\} * \{b^n\}$ given by $g_i \mapsto e_i$,
where the $e_i$ are as defined at the end of §3. Let the image of N under this
embedding generate the normal subgroup $N' \subset \{a^n\} * \{b^n\}$. By the
remark at the end of §3, G/N embeds in $\{a^n\} * \{b^n\} / N'$. But $N'$ is
enumerable by Lemma 1.2(d), since it is generated by the image of an
enumerable set under a recursive map. Therefore, by hypothesis, $N'$ is a
benign normal subgroup. Lemma 4.2 then shows that $\{a^n\} * \{b^n\} / N'$ can
be embedded in an f.p. group. □


We conclude this section by establishing several basic properties of 
benign subgroups. 


4.4. Lemma. Let $E, F \subset G$ be benign subgroups of G. Then:


(a) $E \cap F$ is a benign subgroup;
(b) $|E \cup F|$ (“the sum of E and F in G”) is a benign subgroup.


PROOF. Let $\varphi_1, \varphi_2 : G \to G *_E G$ and $\varphi_1', \varphi_2' : G \to G *_F G$ be the structure
homomorphisms. Let $M_1$ and $M_2$ be f.p. groups such that $G *_E G \subset M_1$ and
$G *_F G \subset M_2$. We identify $\varphi_1(G) \subset M_1$ and $\varphi_1'(G) \subset M_2$ with G, and
construct the group $M_1 *_G M_2$. This group is finitely presented (since it suffices
to add to the relations in $M_1$ and $M_2$ the relations $\varphi_1(g_i) = \varphi_1'(g_i)$ for a
finite system of generators of G). Let $\varphi_1'', \varphi_2'' : M_1, M_2 \to M_1 *_G M_2$ be the
structure embeddings. We set $K = M_1 *_G M_2$ and $L = \varphi_1'' \circ \varphi_2(G)$, and we
embed G in K by means of $\varphi_2'' \circ \varphi_2'$.

We claim that $G \cap L = E \cap F$ (as a subgroup of G in K). In fact,
$\varphi_1''(M_1) \cap \varphi_2''(M_2) = G$ with its canonical embedding in $M_1 *_G M_2$. If we
only take $\varphi_2(G)$ in $M_1$ and $\varphi_2'(G)$ in $M_2$, then intersecting with the
amalgamation G gives E and F, respectively, and intersecting $\varphi_2(G)$ with
$\varphi_2'(G)$ gives $E \cap F$.

(b) The subgroups $\varphi_1(|E \cup F|)$ and $\varphi_2(G)$ have the same intersection
with the amalgamation in $G *_E G$, since they actually contain it. Hence, by
2.3(d), we have $|\varphi_1(|E \cup F|) \cup \varphi_2(G)| \cap \varphi_1(G) = |E \cup F|$ in $G *_E G$, i.e.,
since E is the amalgamation,

$$|\varphi_1(F) \cup \varphi_2(G)| \cap \varphi_1(G) = |E \cup F|.$$

Similarly, we have

$$|\varphi_1'(E) \cup \varphi_2'(G)| \cap \varphi_1'(G) = |E \cup F|$$

in $G *_F G$. The notation is compatible with the fact that these two
intersections are identified in the amalgamation of the product $M_1 *_G M_2$, which is
constructed as in part (a).
Applying 2.3(d) to this product, we find that

$$|\varphi_1''(|\varphi_1(F) \cup \varphi_2(G)|) \cup \varphi_2''(|\varphi_1'(E) \cup \varphi_2'(G)|)| \cap G = |E \cup F|.$$

But the group $|\varphi_1'' \circ \varphi_2(G) \cup \varphi_2'' \circ \varphi_2'(G)| \cap G$ obviously contains the
right-hand side and is contained in the left-hand side of this equality, so that it
also coincides with $|E \cup F|$.

Finally, $|\varphi_1'' \circ \varphi_2(G) \cup \varphi_2'' \circ \varphi_2'(G)|$ is a finitely generated subgroup of the
finitely presented group $M_1 *_G M_2$. The proof is complete. □


4.5. Lemma. Let G and H be f.g. subgroups of f.p. groups. Then any 
homomorphism from G to H takes benign subgroups of G to benign 
subgroups of H. 


PROOF. 

(a) If $A \subset G$ is benign, then $A \times \{1\} \subset G \times H$ is also benign, since,
given an embedding of (G, A) in (K, L) as in 4.1(a), we can construct the
obvious embedding of $(G \times H, A \times \{1\})$ in $(K \times M, L \times \{1\})$, where M is
the f.p. group containing H, which also satisfies the conditions in 4.1(a).
Conversely, if $A \times \{1\} \subset G \times H$ is benign, then from an embedding of
$(G \times H, A \times \{1\})$ in (K, L) as in 4.1(a) we construct the corresponding
embedding of (G, A) in $(K, L \cap G \times \{1\})$.

(b) Now let $\varphi : G \to H$ be any homomorphism, let $\Gamma$ be its graph, and
let $A \subset G$ be a benign subgroup. Then in $G \times H$ we have:

$$\{1\} \times \varphi(A) = \big|\,(|A \times \{1\} \cup \{1\} \times H| \cap \Gamma) \cup G \times \{1\}\,\big| \cap \{1\} \times H.$$

It is clear from the assumptions regarding G and H that $\Gamma$ is a benign
subgroup in $G \times H$. By part (a), the other subgroups on the right in the
formula are also benign. By Lemma 4.4, $\{1\} \times \varphi(A)$ is a benign subgroup.
Hence, $\varphi(A)$ is also benign. □
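
The formula expressing $\{1\} \times \varphi(A)$ through sums and intersections is purely set-theoretic, so it can be tested in any small example. The sketch below uses the finite cyclic groups Z/12 and Z/4 (written additively, so the unit is 0) and reduction mod 4 as the homomorphism; these groups and all names are choices made only for the illustration and have nothing to do with benignness itself.

```python
N_G, N_H = 12, 4                       # G = Z/12, H = Z/4 (illustrative choice)

def phi(g):
    """Homomorphism Z/12 -> Z/4; well defined because 4 divides 12."""
    return g % N_H

def add(p, q):
    """Group operation in G x H."""
    return ((p[0] + q[0]) % N_G, (p[1] + q[1]) % N_H)

def generated(gens):
    """Subgroup of the finite group G x H generated by the set gens."""
    group = {(0, 0)}
    frontier = [(0, 0)]
    while frontier:
        p = frontier.pop()
        for q in gens:
            r = add(p, q)
            if r not in group:
                group.add(r)
                frontier.append(r)
    return group

G1 = {(g, 0) for g in range(N_G)}              # G x {1}
H1 = {(0, h) for h in range(N_H)}              # {1} x H
GRAPH = {(g, phi(g)) for g in range(N_G)}      # graph of phi

def image_via_graph(A):
    """Right-hand side of the formula: sums and intersections only."""
    A1 = {(x, 0) for x in A}                   # A x {1}
    inner = generated(A1 | H1) & GRAPH
    return generated(inner | G1) & H1
```

For the subgroup A = {0, 2, 4, 6, 8, 10} the function returns {(0, 0), (0, 2)}, which is $\{1\} \times \varphi(A)$ in additive notation.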


5 Bounded systems of generators 


5.1. Let $G' = |\{a_1, \ldots, a_n\}|$, $n \ge 1$, be the group freely generated by the $a_i$.
We call a subset $R' \subset G'$ bounded if there exists an $r \ge 1$ such that any
element in $R'$ can be represented in the form $a_{i_1}^{x_1} \cdots a_{i_r}^{x_r}$, $x_i \in \mathbb{Z}$. In this
section we prove the following special case of the hypothesis of Proposition
4.3:


5.2. Proposition. If the subgroup $H' \subset G'$ is generated by a bounded enumerable
subset $R' \subset G'$, then it is benign.
Corollary. The same is true if $G'$ is an f.g. subgroup of an f.p. group (using
Lemma 4.5).


In the next section we show how the general case follows from this 
special case. 
The proof of 5.2 consists of a series of reduction steps. 


5.3. First reduction. In the free group G = |{a), by, C13 * + * 3@ins Ons Con} | WE 
shall consider a set of “layered” words of the form 
R= {aybycf" te reid LOL ei 

and the subgroup H C G it generates. We shall later show that, if R is 

enumerable, then H is benign. This is a special case of 5.2 to which the 
general case reduces using the following technique. 

Suppose we are given G’ and R’ as in 5.1. For each element g’= 

a+ - - a & R’ we construct an element g € G as follows. We represent 


1 


g’ in the form 
n n 
x 7 X23 soe oe Xi 
Il a; 1 Il qa; 2, I az”; 
i=l i=l i=l 


where 


xX, fori=i,, 
XK, 5 = . : 
0, foriFx iy. 


We then set 


n n 
pan Xi Xi x2; XQ; 
e-( I abc" | I ays) 


i=] i=l 


n 
aioe ( II ai naar ona fiom 
i=l 
If R’ is enumerable, then the set R of all elements g obtained from all 
the g’ € R’ is enumerable. 
We consider the surjective homomorphism ¢ : GG’ given by $(4,;4;) 


=a (l<i<n0<j<r-)), 6) =¢(¢)=1 for all i=1,..., 7. 
Clearly #(R) = R’, and hence ¢(H) = A’. It then follows from Lemma 4.5 
that, if R is benign in G, then R’ is benign in G’. O 


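5.4. Using the theorem that all enumerable sets are Diophantine.
From this point on, we fix a pair (G, enumerable R), as in 5.3. We shall
write $l \ge 1$ in place of rn. We define the set $E \subset \mathbb{Z}^{l+1}$ by the condition

$$R = \{a_0^{x_0} b_0 c_0^{x_0} \cdots a_l^{x_l} b_l c_l^{x_l} \mid \langle x_0, \ldots, x_l \rangle \in E\}.$$

It is not hard to see that R is enumerable if and only if E is enumerable.
We now show that E can be represented as the projection onto the first
l + 1 coordinates of a set

$$\bigcap_{s=1}^{N} E_s \subset \mathbb{Z}^{l+1} \times \mathbb{Z}^{m-l}, \quad m \ge l + 2,$$

where each of the $E_s$ is defined by an equation of one of the following forms:

$$x_i = c, \quad c \in \mathbb{Z};$$
$$x_i = x_j, \quad 0 \le i, j \le m;$$
$$x_k = x_j + x_i, \quad l + 1 \le k < j < i \le m;$$
$$x_k = x_j \cdot x_i, \quad l + 1 \le k < j < i \le m.$$

In fact, let $\varepsilon_0, \ldots, \varepsilon_l \in \{1, -1\}$, and let $\varepsilon = \langle \varepsilon_0, \ldots, \varepsilon_l \rangle$. We consider
the enumerable sets

$$E^{\varepsilon} = \{\langle x_0, \ldots, x_l \rangle \in (\mathbb{Z}^+ \cup \{0\})^{l+1} \mid \langle \varepsilon_0 x_0, \ldots, \varepsilon_l x_l \rangle \in E\}.$$

By the fundamental theorem in Chapter VI, there exist polynomials $P^{\varepsilon}$
with integral coefficients such that

$E^{\varepsilon}$ = the projection of the 0-level of $P^{\varepsilon}$ in $(\mathbb{Z}^+ \cup \{0\})^{l+1} \times
(\mathbb{Z}^+)^{n-l}$ onto the first l + 1 coordinates $(x_0, \ldots, x_l)$.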


Here we can take n large enough so that the sets of variables which 
actually occur in P* and in P®’ and which “drop out” in the projection do 
not intersect if & # €”. If we add the (n + 1)2'*? new variables Vig OCIK< 
n, j = 1, 2, 3, 4) to the variables which drop out in the projection, we find 
that E can be represented as the projection onto the first / + 1 coordinates 


of the 0-level of the following polynomial, where the 0-level is now in 
Zi+L yg gat (nt N2'3— 1. 


: 2 
o=|] (Pp: CS ee Neagetnete)) 
é 


/ % 2 . , 2 
+ D(os- 303) + D (s-1- 3a} | 
j=l 


i=0 i=/+1 j=l 


Finally, in order to represent the set Q = 0 as a projection of an
intersection $\bigcap_{s=1}^{N} E_s$ of the required type, we introduce additional variables
as follows. Let $x_0, \ldots, x_l$ be the variables which occur in Q. Instead of
Q = 0 we write $Q_1 = Q_2$, where $Q_1$ is the sum of the monomials in Q with
positive coefficients, and $Q_2$ is the sum of the monomials with negative
coefficients. Then

0-level of Q = a projection of $(x_{l+1} = Q_1) \cap (x_{l+2} = Q_2) \cap (x_{l+1} = x_{l+2})$.
If $Q_1$ and $Q_2$ are constants or variables, this gives us the desired
representation. Otherwise, we write, say, $Q_1$ in the form $Q_1' + Q_1''$ or $Q_1' \cdot Q_1''$
and, after introducing two more variables, we have, for example,

$(x_{l+1} = Q_1' + Q_1'')$ = a projection of $(x_{l+3} = Q_1')
\cap (x_{l+4} = Q_1'') \cap (x_{l+1} = x_{l+3} + x_{l+4})$.
We complete the proof by induction on the sum of the absolute values of
the coefficients and on the degree of Q. □
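
The splitting of a polynomial equation $Q_1 = Q_2$ into equations of the four elementary forms can be sketched as a recursion over expression trees. Everything below is an illustrative assumption: expressions are nested tuples, auxiliary variables are numbered after the original ones, and the sketch does not try to enforce the ordering conditions on the indices k, j, i stated earlier.

```python
def flatten(expr, equations, counter):
    """Return the index of a variable equal to expr, appending elementary
    equations (const / add / mul) for every auxiliary variable introduced."""
    kind = expr[0]
    if kind == "var":
        return expr[1]                  # original variable, nothing to add
    if kind == "const":
        k = counter[0]
        counter[0] += 1
        equations.append(("const", k, expr[1]))        # x_k = c
        return k
    i = flatten(expr[1], equations, counter)
    j = flatten(expr[2], equations, counter)
    k = counter[0]
    counter[0] += 1
    equations.append((kind, k, i, j))   # x_k = x_i + x_j  or  x_k = x_i * x_j
    return k

def decompose(q1, q2, num_vars):
    """Elementary equations whose projection is the solution set of q1 = q2."""
    equations, counter = [], [num_vars]
    p = flatten(q1, equations, counter)
    q = flatten(q2, equations, counter)
    equations.append(("eq", p, q))      # x_p = x_q
    return equations

def holds(equations, originals):
    """Check whether the original values extend to a solution of the system;
    the auxiliary values are forced, so they are simply computed in order."""
    values = dict(enumerate(originals))
    for eq in equations:
        if eq[0] == "const":
            values[eq[1]] = eq[2]
        elif eq[0] == "add":
            values[eq[1]] = values[eq[2]] + values[eq[3]]
        elif eq[0] == "mul":
            values[eq[1]] = values[eq[2]] * values[eq[3]]
        elif values[eq[1]] != values[eq[2]]:
            return False
    return True

# Q = x0^2 - x1 - 2, rewritten as Q1 = Q2 with Q1 = x0 * x0 and Q2 = x1 + 2.
X0, X1 = ("var", 0), ("var", 1)
SYSTEM = decompose(("mul", X0, X0), ("add", X1, ("const", 2)), num_vars=2)
```

Here `holds(SYSTEM, [3, 7])` is true because 9 = 7 + 2, while `holds(SYSTEM, [3, 5])` is false; a point of the original space lies in the projection exactly when the forced auxiliary values satisfy the final equality.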


5.5. Second reduction. We now assume that, along with the pair (G, %) 
described in 5.3, we have fixed a representation of E in the form ()™_,£, 
as in 5.4. In this subsection we show that the subgroup H C G generated by 


R is benign if all of the following subgroups H, CG,s=1,...,N, are 
benign: 
G=|{ ap, bo, Cos one ay, 5, » Om a, b,, Cy ey By b, é}|; 


i=0 


m -1 i -lm 
H, = { I arb} ( II ari TI a%b,c7; (xq, .. +s Xe E | : 
To show this, we first set 


m -lf 2 -lm 
A(Xq,-- +s Xn) = ( I arb. ( I arb] II a;bjc*. (1) 
i=/+1 i=] i=0 
The set of words {a(Xo,..- 5 %m)i (Xoo ++ ->%m> EZ"*!} is free, since, 
when we join two such words (or when we join such a word with the 
inverse of another such word), any cancellation cannot involve the “middle 
part” of each word, which consists of the symbols a,, by, C,. 
It hence follows that 


Wy — 
() A= 


s=] 


’ 


N 
| n-th Geos DE () 6 


i=l 


and the subgroup H = ()*_,H, c G is benign if all of the H, are benign. 


Finally, we have 


s=l 
|# U { 41 rts Cet +2 Ame Ome Cini Bs By Cys» +++ Ap bp }| 


1 N 
= II a;*b,c, (xp, ..., x) € E = projection of () E, 


i=0 s=l1 


U {4141 Bras Can +> A Op g} 
so that
H=|HU (ara Bete +s bn &}[ | (ao +++ be} 
Therefore, H is benign whenever H is benign. O 


5.6. Construction of the group K. We use the criterion 4.1(a) to verify that 
the H, C G are benign subgroups. That is, we explicitly construct a finitely 
presented group K > G and finitely generated subgroups L, C K such that 


L.AG= H, for all s=1,..., NN. We construct K as a multiple HNN- 


extension of G. 
(a) The first HNN-extension. We set 


Ky=|GU {tp -- +5 tm} + Rol, 
where Rp is the set of relations 


ta 'b.t,= a,b,c; and ¢,'bt,=a,b,¢,fori=0,...,m; 


the s, commute with all the other generators of G 1: (2) 
(b) The second HNN-extension. We set 
K=|KoU (ys lt 1 Sk <iK <j iF ij, kK < mj: Ri, 
where $R$ is the set of relations

$$t_{ijk}^{-1} b_i t_{ijk} = a_i b_i c_i, \qquad t_{ijk}^{-1} c_j t_{ijk} = t_k c_j;$$

the $t_{ijk}$ commute with $t_k$ and with the other generators of $G$. (3)
Unlike in 5.6(a), here it is not completely obvious that $K$ is an
HNN-extension of $K_0$. To check this it suffices to show that the map
$\varphi_{ijk}$ ($i, j, k$ fixed, $i \ne k$, $j \ne k$) from the set
{generators of $G$} $\cup\ \{t_k\}$ to itself which takes

$$b_i \mapsto a_i b_i c_i, \qquad c_j \mapsto t_k c_j, \qquad t_k \mapsto t_k$$

and leaves the other generators of $G$ fixed, extends to an automorphism of
the subgroup $|G \cup \{t_k\}| \subset K_0$. We have:

$$|G \cup \{t_k\}| = |G \cup \{t_k\} : t_k^{-1} b_i t_k = b_i,\ t_k^{-1} c_j t_k = c_j,\ \dots|,$$

where the $\dots$ stands for relations which do not involve $b_i$ and $c_j$ and so
are taken to themselves under $\varphi_{ijk}$. On the other hand, the two relations
which are written out are taken to relations which follow from the defining
relations in $K_0$: the first goes to

$$t_k^{-1} a_i b_i c_i t_k = a_i b_i c_i,$$

and the second goes to

$$t_k^{-1} t_k c_j t_k = t_k c_j.$$

It remains to use the stipulation that $i \ne k$ and $j \ne k$.
It is clear from the definition of $K$ that $K$ is finitely presented. It follows
from the properties of HNN-extensions that $G \subset K$.


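5.7. Construction of the subgroups $L_s \subset K$. The form of $L_s$ will depend on
the equation defining the set $E_s$ (see 5.4). We define a large number of
groups which will include all the $L_s$:

$$L_i^{c} = |\{a(0, \dots, c, \dots, 0),\ t_r\ (r \ne i)\}|,$$
$$L_{ij}^{=} = |\{a(0, \dots, 0),\ t_i t_j,\ t_r\ (r \ne i, j)\}|,$$
$$L_{ijk}^{+} = |\{a(0, \dots, 0),\ t_i t_k,\ t_j t_k,\ t_r\ (r \ne i, j, k)\}|,$$
$$L_{ijk}^{\times} = |\{a(0, \dots, 0),\ t_{ijk},\ t_{jik},\ t_r\ (r \ne i, j, k)\}|,$$

and analogously, in the notation of 5.5,

$$H_i^{c} = |\{a(x_0, \dots, x_m) : x_i = c\}|,$$
$$H_{ij}^{=} = |\{a(x_0, \dots, x_m) : x_i = x_j\}|,$$
$$H_{ijk}^{+} = |\{a(x_0, \dots, x_m) : x_k = x_i + x_j\}|,$$
$$H_{ijk}^{\times} = |\{a(x_0, \dots, x_m) : x_k = x_i \cdot x_j\}|.$$

The $L_s$ are clearly finitely generated. It remains to perform one final series
of verifications: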
5.8. $H_i^{c} = G \cap L_i^{c}$, $H_{ij}^{=} = G \cap L_{ij}^{=}$, and so on.
First of all, it follows from (1), (2), and (3) that

$$t_i^{-1} a(x_0, \dots, x_i, \dots, x_m)\, t_i = a(x_0, \dots, x_i + 1, \dots, x_m), \qquad (4)$$
$$t_{ijk}^{-1} a(x_0, \dots, x_m)\, t_{ijk} = a(y_0, \dots, y_m), \qquad (5)$$

where $y_i = x_i + 1$, $y_k = x_k + x_j$, and $y_s = x_s$ for $s \ne i, k$. (To verify (5) recall
that, since $k \ge l + 1$, it follows that $t_k$ commutes with the middle part of
the word $a(x_0, \dots, x_m)$, which consists of $a_i$, $b_i$, $c_i$, $i \le l$.)
It hence follows that

$$L_i^{c} = |H_i^{c} \cup \{t_r \mid r \ne i\}|,$$
$$L_{ij}^{=} = |H_{ij}^{=} \cup \{t_i t_j,\ t_r \mid r \ne i, j\}|,$$
$$L_{ijk}^{+} = |H_{ijk}^{+} \cup \{t_i t_k,\ t_j t_k,\ t_r \mid r \ne i, j, k\}|,$$
$$L_{ijk}^{\times} = |H_{ijk}^{\times} \cup \{t_{ijk},\ t_{jik},\ t_r \mid r \ne i, j, k\}|.$$

In fact, the inclusions $\subset$ are obvious. Next, if we begin with $a(x_0, \dots, x_m)$
and conjugate by $t_r$, it follows by (4) that we can vary the $r$th coordinate
arbitrarily. This immediately gives the inclusion $L_i^{c} \supset H_i^{c}$, and hence the
first required equality. The second equality is obtained analogously.


The third equality: conjugating by $t_i t_k$ increases the $i$th and $k$th
coordinates by 1, and conjugating by $t_j t_k$ increases the $j$th and $k$th
coordinates by 1, so that we can obtain any vector with $x_k = x_i + x_j$
starting from a vector with zeros in these places.

The fourth equality: conjugating by $t_{ijk}$ increases $x_i$ by 1 and increases
$x_k$ by $x_j$, and conjugating by $t_{jik}$ increases $x_j$ by 1 and $x_k$ by $x_i$. Hence, we
can obtain any vector with $x_k = x_i \cdot x_j$ starting from the zero vector.
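
The bookkeeping behind the third and fourth equalities can be checked on exponent vectors alone. The following is a minimal sketch, not part of the original argument: it assumes that conjugation by $t_r$ adds 1 to the $r$th coordinate, as in (4), and that conjugation by $t_{ijk}$ adds 1 to $x_i$ and adds the old value of $x_j$ to $x_k$, as in (5). The function names are hypothetical, and only nonnegative targets are covered (negative values would use the inverse conjugations).

```python
def conj_t(x, r):
    """Effect of relation (4) on the exponent vector: x_r increases by 1."""
    y = list(x)
    y[r] += 1
    return y


def conj_tijk(x, i, j, k):
    """Effect of relation (5): x_i increases by 1, x_k increases by the old x_j."""
    y = list(x)
    y[i] += 1
    y[k] += x[j]
    return y


def reach_sum(u, v, i, j, k, m):
    """Reach x_i = u, x_j = v, x_k = u + v using the moves t_i t_k and t_j t_k."""
    x = [0] * (m + 1)
    for _ in range(u):
        x = conj_t(conj_t(x, i), k)   # move t_i t_k
    for _ in range(v):
        x = conj_t(conj_t(x, j), k)   # move t_j t_k
    return x


def reach_product(u, v, i, j, k, m):
    """Reach x_i = u, x_j = v, x_k = u * v using the moves t_ijk and t_jik."""
    x = [0] * (m + 1)
    for _ in range(u):
        x = conj_tijk(x, i, j, k)     # x_i += 1, x_k += x_j (still 0 here)
    for _ in range(v):
        x = conj_tijk(x, j, i, k)     # t_jik: x_j += 1, x_k += x_i = u
    return x


print(reach_sum(2, 3, 0, 1, 2, 2))
print(reach_product(2, 3, 0, 1, 2, 2))
```

The product case works because each application of $t_{jik}$ adds the current value of $x_i$ to $x_k$, so after $v$ steps the $k$th coordinate equals $u \cdot v$.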

This new characterization of the groups $L_s$ shows that $L_s \cap G \supset H_s$ for
all $s$. It remains to prove the converse.

To do this, we note that, using (4) and (5), we can represent any element
in $L_s$ in the form $Th$, where $T \in |\{t_i, t_{ijk}\}|$ (here the set of admissible
indices $i$ and $ijk$ depends on $s$) and $h \in H_s$. This follows by the same
argument as above. But, by Proposition 2.7(a), all the $\{t_i, t_{ijk}\}$ generate a
free subgroup which has a trivial intersection with $G$ (see the proof of
2.7(a)). Consequently, if $Th \in G$, it follows that $T = 1$ and $h \in H_s$, which
completes the proof. □


6 End of the proof 


6.1. In this section we finish the verification of Proposition 4.3, and hence 
the proof of Higman’s theorem. 

Let $G = |\{a, b\}|$, and let $H \subset G$ be an enumerable subgroup. We shall
show that $H$ is benign. The first step is to reduce the problem to proving
that a certain special subgroup

$$H' \subset G' = \mathop{*}_{1}^{7} \mathbb{Z},$$

which does not depend on $H$, is benign. To define $H'$, we first introduce
the following recursive enumeration $\gamma : \mathbb{Z}^+ \to G$ (which covers each $g \in G$
infinitely many times):

$$\gamma(2^{m_0} 3^{m_1} \cdots p_j^{m_j} \cdots) = \prod_{i \ge 0} a^{m_{4i} - m_{4i+1}}\, b^{m_{4i+2} - m_{4i+3}}.$$

We then set

$$G' = |\{a, b, t, v, c, d, e\}|;$$

$$\tau : S(\{a, b, a^{-1}, b^{-1}\}) \to G' : \prod_{i \ge 0} a^{m_i} b^{n_i} \mapsto t \prod_{i \ge 0} (v^{-i} a v^{i})^{m_i} (v^{-i} b v^{i})^{n_i};$$

$$H' = |\{\tau(g) c^n d e^n \mid g \in S(\{a, b, a^{-1}, b^{-1}\}),\ n \in \mathbb{Z}^+,\ g = \gamma(n)\}| \subset G'.$$

The formula for $\tau$ defines $\tau$ on words which are not necessarily reduced,
and reducing a word can change its image under $\tau$. Also note that a
generator $\tau(g) c^n d e^n$ of $H'$ is uniquely determined by $n$.
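
As an illustration of the enumeration $\gamma$, the following sketch computes $\gamma(n)$ from the prime factorization of $n$. It is not taken from the text: the helper names are hypothetical, the indexing $p_0 = 2$ is an assumption read off from the displayed formula, and words are represented as lists of (letter, exponent) syllables.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (assumed indexing: p_0 = 2)."""
    found = []
    for k in count(2):
        if all(k % p for p in found):
            found.append(k)
            yield k


def exponents(n):
    """Return [m_0, m_1, ...] with n = prod p_j**m_j, padded to a multiple of 4."""
    ms = []
    for p in primes():
        if n == 1:
            break
        m = 0
        while n % p == 0:
            n //= p
            m += 1
        ms.append(m)
    while len(ms) % 4:
        ms.append(0)
    return ms


def gamma(n):
    """The unreduced word gamma(n) as a list of (letter, exponent) syllables."""
    ms = exponents(n)
    word = []
    for i in range(0, len(ms), 4):
        word.append(("a", ms[i] - ms[i + 1]))
        word.append(("b", ms[i + 2] - ms[i + 3]))
    return word


def reduce_word(word):
    """Freely reduce: drop zero exponents and merge adjacent equal letters."""
    out = []
    for g, e in word:
        if e == 0:
            continue
        if out and out[-1][0] == g:
            e += out.pop()[1]
            if e == 0:
                continue
        out.append((g, e))
    return out


for n in (2, 4, 6, 7, 22):
    print(n, gamma(n), reduce_word(gamma(n)))
```

For instance, $n = 4$ and $n = 22$ give different words with the same reduction $a^2$, which illustrates why an element of $G$ is covered more than once, and why the generator $\tau(g)c^n d e^n$ has to be indexed by $n$ rather than by the reduced element.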
6.2. Lemma. If $H' \subset G'$ is a benign subgroup, then any enumerable subgroup
$H \subset G$ is benign.


PROOF.
(a) We set

$$H'' = |\{\tau(h) c^n d e^n \mid \text{image of } h \in H,\ n \in \mathbb{Z}^+,\ h = \gamma(n)\}| \subset H'.$$

Then

$$H'' = H' \cap |\{a, b, t, v, c^n d e^n \mid n \in \gamma^{-1}(H)\}|.$$

In fact, the inclusion $\subset$ is obvious. The converse follows because the set of
images of the elements $c^n d e^n$, $n \ge 1$, in the quotient of $G'$ by the kernel
generated by $a$, $b$, $t$, and $v$, is free. Hence, in any reduced word in the
generators $\tau(g) c^n d e^n$, the sequence of $n$'s can be uniquely recovered from
the word, and, if all the $n$'s lie in $\gamma^{-1}(H)$, it follows that the word lies in
$H''$.

Thus, $H''$ is the intersection of $H'$ with the subgroup generated by a
bounded enumerable set of generators (since $\gamma^{-1}(H)$ is enumerable
whenever $H$ is). Consequently, $H''$ is benign if $H'$ is benign.

(b) We set

$$\tilde{H} = |\{\tau(h) \mid h \in H\}| \subset G'.$$

It is easy to see that

$$|\tilde{H} \cup \{c, d, e\}| = |H'' \cup \{c, d, e\}|.$$

Hence,

$$\tilde{H} = |H'' \cup \{c, d, e\}| \cap |\{a, b, v, t\}|.$$

By Lemma 4.4, $\tilde{H}$ is benign if $H''$ is benign.

(c) Finally, we consider the homomorphism $\varphi : G' \to G$ which takes $a$ to
$a$, $b$ to $b$, and $t$, $v$, $c$, $d$, and $e$ to 1. Obviously, $\varphi(\tilde{H}) = H$. By Lemma 4.5, $H$
is benign if $\tilde{H}$ is benign. □


6.3. We now prove that the subgroup $H' \subset G'$ is benign. To do this, we
construct a commutative diagram of group embeddings

$$\begin{array}{ccccc} G' & \to & K' & \to & K \\ \cup & & \cup & & \cup \\ H' & \to & L' & \to & L \end{array}$$

with the following properties:


(a) $K$ is defined by a finite set of generators and a bounded enumerable
set of relations; $L$ is generated by a bounded enumerable set of words
in the generators of $K$.

(b) $L' \to L$ is an isomorphism.

(c) $H' = G' \cap L'$ in $K'$.


It will then follow that $H'$ is benign. In fact, let $K = F/R$, where $F$ is
the free group generated by a finite system of generators of $K$, $R_0$ is a
bounded enumerable set of relations between these generators, and $R$ is
the normal subgroup generated by these relations. It follows from
Proposition 5.2 that $R_0$ generates a benign subgroup $R$ in $F$, and then Lemma 4.2
implies that $K = F/R$ can be embedded in an f.p. group $M$. When we
embed $K$ in $M$, the bounded enumerable set of generators of $L$ remains a
bounded enumerable set in $M$ (relative to the generators of $M$), and hence
$L \subset M$ is benign by the corollary to Proposition 5.2. Therefore, by (b) and
(c) we have: the subgroup $H' = G' \cap L$ is benign as a subgroup of $M$
whenever $G'$ and $L$ are benign. Hence, there is an embedding of $(M, H')$ in
$(\tilde{M}, \tilde{H})$ such that $\tilde{M}$ is finitely presented, $\tilde{H}$ is finitely generated, and
$H' = \tilde{H} \cap M$. This embedding induces an embedding of the pair $(G', H')$
in $(\tilde{M}, \tilde{H})$ with the same properties. Consequently, $H'$ is also benign in $G'$.

It remains to construct the diagram of embeddings with properties (a), 
(b), and (c). 


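6.4. The group $K'$. This will be a multiple HNN-extension of $G'$ which, as
in Proposition 2.7, we define using four countable sequences of non-trivial
isomorphisms of the subgroup $|\{t, c, d, e, v^{-i} a v^{i}, v^{-i} b v^{i} \mid i \ge 0\}| \subset G'$ with
$G'$. Since the elements listed here freely generate this subgroup, it is
sufficient to indicate where our isomorphisms take these elements. These
isomorphisms will be induced in $K'$ by conjugation by four sequences of
generators $x_i$, $\bar{x}_i$, $y_i$, and $\bar{y}_i$, $i \ge 0$ (instead of the $t_i$, $i \in J$, in §2). The
following table gives the action of these generators. We use the notation:
$a_i = v^{-i} a v^{i}$, $b_i = v^{-i} b v^{i}$, $p_j$ = the $j$th prime number. The element in the
table, say, in the $c$-row and the $x_i$-column, is $x_i^{-1} c x_i$.

$$\begin{array}{c|cccc}
 & x_i & \bar{x}_i & y_i & \bar{y}_i \\ \hline
t & t a_i & t a_i^{-1} & t b_i & t b_i^{-1} \\
c & c^{p_{4i}} & c^{p_{4i+1}} & c^{p_{4i+2}} & c^{p_{4i+3}} \\
d & d & d & d & d \\
e & e^{p_{4i}} & e^{p_{4i+1}} & e^{p_{4i+2}} & e^{p_{4i+3}} \\
a_j & a_j,\ j \ge i & a_j,\ j \ge i & a_j,\ j > i & a_j,\ j > i \\
a_j & a_i^{-1} a_j a_i,\ j < i & a_i a_j a_i^{-1},\ j < i & b_i^{-1} a_j b_i,\ j \le i & b_i a_j b_i^{-1},\ j \le i \\
b_j & a_i^{-1} b_j a_i,\ j < i & a_i b_j a_i^{-1},\ j < i & b_i^{-1} b_j b_i,\ j < i & b_i b_j b_i^{-1},\ j < i \\
b_j & b_j,\ j \ge i & b_j,\ j \ge i & b_j,\ j \ge i & b_j,\ j \ge i
\end{array}$$

We finally set:

$$K' = |G' \cup \{x_i, \bar{x}_i, y_i, \bar{y}_i \mid i \ge 0\} : \text{the relations in the table}|,$$

and we take $G' \to K'$ to be the natural embedding.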
6.5. The group $L'$. We set

$$L' = |\{tcde,\ x_i,\ \bar{x}_i,\ y_i,\ \bar{y}_i \mid i \ge 0\}| \subset K',$$

and we take $L' \to K'$ to be the natural embedding. In subsection 6.7 we
shall verify that $H'$ is embedded in $L'$ (as a subgroup of $K'$, in view of the
commutativity of the diagram).


6.6. The groups $K$ and $L$. We set

$$K = |G' \cup \{u_1, u_2, u_3, u_4, v_1, v_2, v_3, v_4\} : R|,$$

where the relations $R$ and the embedding $K' \to K$ are both defined by the
conditions


$R$ = the image of the relations in the table after making the
substitutions

$$x_i \mapsto u_1^{-i} v_1 u_1^{i}, \quad \bar{x}_i \mapsto u_2^{-i} v_2 u_2^{i}, \quad y_i \mapsto u_3^{-i} v_3 u_3^{i}, \quad \bar{y}_i \mapsto u_4^{-i} v_4 u_4^{i};$$

$K' \to K$ is the homomorphism which is the identity on $G'$
and acts by these substitutions on the other generators.


The homomorphism $K' \to K$ is an embedding. In fact, the elements $u_j^{-i} v_j u_j^{i}$
are free in $|\{u_j, v_j\}|$, so that $K$ can be considered as the free product of $K'$
and $|\{u_j, v_j \mid 1 \le j \le 4\}|$ with the amalgamation given by the above
substitutions (here we take into account Proposition 2.7(a)).

Finally, we set

$L$ = the image of $L'$ under the embedding $K' \to K$.


6.7. The diagram has now been constructed. It follows immediately from
the definitions that it satisfies 6.3(a) and (b). It remains to show that
$H' = G' \cap L'$ in $K'$.

(a) We set $[n] = \tau(g) c^n d e^n$ for $n \in \mathbb{Z}^+$ and $g = \gamma(n)$ in the notation of
6.1. We recall that $H'$ is generated by all the $[n]$ in $G'$, and hence in $K'$ as
well.

The table of relations in $K'$ was composed in such a way that the
following relations would be fulfilled:

$$x_i^{-1}[n]x_i = [p_{4i} n], \qquad \bar{x}_i^{-1}[n]\bar{x}_i = [p_{4i+1} n],$$
$$y_i^{-1}[n]y_i = [p_{4i+2} n], \qquad \bar{y}_i^{-1}[n]\bar{y}_i = [p_{4i+3} n].$$


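For example, we verify the first relation. Let $n = \prod_j p_j^{m_j}$. Then, according to
the definitions,

$$\gamma(n) = \prod_j a^{m_{4j} - m_{4j+1}}\, b^{m_{4j+2} - m_{4j+3}},$$

$$[n] = t \prod_j a_j^{m_{4j} - m_{4j+1}}\, b_j^{m_{4j+2} - m_{4j+3}}\, c^n d e^n,$$

so that, by the first column of the table in 6.4,

$$x_i^{-1}[n]x_i = t a_i \cdot a_i^{-1} \prod_{j < i} \bigl(a_j^{m_{4j} - m_{4j+1}} b_j^{m_{4j+2} - m_{4j+3}}\bigr) a_i \prod_{j \ge i} \bigl(a_j^{m_{4j} - m_{4j+1}} b_j^{m_{4j+2} - m_{4j+3}}\bigr) c^{p_{4i} n} d e^{p_{4i} n} = [p_{4i} n].$$
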
If we further take into account that $[1] = tcde \in L'$, we may conclude from
these conjugation formulas that $[n] \in L'$ for all $n$, and that $H' \subset L'$, as
promised in 6.5. Moreover, $|H' \cup \{x_i, \bar{x}_i, y_i, \bar{y}_i \mid i \ge 0\}| = L'$, since the
inclusion $\subset$ has been verified, and the inclusion $\supset$ is obvious.
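
The relation $x_i^{-1}[n]x_i = [p_{4i} n]$ can also be checked mechanically for small $n$. The sketch below is only an illustration under stated assumptions: it treats $t$, $c$, $d$, $e$, $a_j$, $b_j$ as free generators (as asserted in 6.4), applies the first column of the table to each generator, and compares freely reduced words. All function names are hypothetical, the indexing $p_0 = 2$ is assumed, and only the $x_i$-column is covered.

```python
from itertools import count, islice


def primes():
    """Yield 2, 3, 5, ... by trial division (assumed indexing: p_0 = 2)."""
    found = []
    for k in count(2):
        if all(k % p for p in found):
            found.append(k)
            yield k


def nth_prime(k):
    return next(islice(primes(), k, None))


def exponents(n):
    """Return [m_0, m_1, ...] with n = prod p_j**m_j, padded to a multiple of 4."""
    ms = []
    for p in primes():
        if n == 1:
            break
        m = 0
        while n % p == 0:
            n //= p
            m += 1
        ms.append(m)
    while len(ms) % 4:
        ms.append(0)
    return ms


def reduce_word(word):
    """Freely reduce a list of (generator, exponent) syllables."""
    out = []
    for g, e in word:
        if e == 0:
            continue
        if out and out[-1][0] == g:
            e += out.pop()[1]
            if e == 0:
                continue
        out.append((g, e))
    return out


def bracket(n):
    """[n] = t * prod_j a_j^(m_4j - m_4j+1) b_j^(m_4j+2 - m_4j+3) * c^n d e^n."""
    ms = exponents(n)
    w = [("t", 1)]
    for j in range(len(ms) // 4):
        w.append((("a", j), ms[4 * j] - ms[4 * j + 1]))
        w.append((("b", j), ms[4 * j + 2] - ms[4 * j + 3]))
    w += [("c", n), ("d", 1), ("e", n)]
    return reduce_word(w)


def power(word, e):
    """word**e as a syllable list; a negative e inverts the word first."""
    if e >= 0:
        return word * e
    return [(g, -k) for g, k in reversed(word)] * (-e)


def conj_x(word, i):
    """Apply the x_i-column of the table to every syllable, then reduce."""
    p, a_i, out = nth_prime(4 * i), ("a", i), []
    for g, e in word:
        if g == "t":
            out += power([("t", 1), (a_i, 1)], e)     # t -> t a_i
        elif g in ("c", "e"):
            out.append((g, p * e))                    # c -> c^p, e -> e^p
        elif g == "d" or g[1] >= i:
            out.append((g, e))                        # d and a_j, b_j (j >= i) fixed
        else:
            out += [(a_i, -1), (g, e), (a_i, 1)]      # conjugation by a_i for j < i
    return reduce_word(out)


for n in range(1, 40):
    for i in range(3):
        assert conj_x(bracket(n), i) == bracket(nth_prime(4 * i) * n)
```

The check passes because the inserted $a_i$ cancels against each conjugating $a_i^{-1}$ and finally lands immediately before $a_i^{m_{4i} - m_{4i+1}}$, raising the exponent $m_{4i}$ by one.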

(b) We now show that in $K'$ we have

$$|H' \cup \{x_i, \bar{x}_i, y_i, \bar{y}_i \mid i \ge 0\}| \cap G' = H'.$$

Since $K'$ is an HNN-extension of $G'$, it suffices to show that we are in the
situation of Proposition 2.7 (as described in the paragraph preceding the
proposition, at the end of 2.6), and then to apply 2.7(b).

We verify these conditions, for example, for the first series of
isomorphisms of the subgroup of $G'$, as described at the beginning of 6.4. This
series corresponds to conjugating by $x_i$ in $K'$. The conditions take the
following form in our case:

$$x_i^{-1}\bigl(H' \cap |\{t, c, d, e;\ a_j, b_j \mid j \ge 0\}|\bigr)x_i = H' \cap x_i^{-1}|\{t, c, d, e;\ a_j, b_j \mid j \ge 0\}|x_i;$$

i.e., if we use the definition of $H'$ and the table,

$$x_i^{-1} H' x_i = H' \cap |\{t, c^{p_{4i}}, d, e^{p_{4i}};\ a_j, b_j \mid j \ge 0\}|.$$

Since $x_i^{-1}[n]x_i = [p_{4i} n]$, the inclusion $\subset$ is obvious. Conversely, suppose
we are given an element in $H'$ which is written as a reduced word in the
$[n]$: $\prod_k [n_k]^{\varepsilon_k}$, $\varepsilon_k = \pm 1$. We consider the corresponding reduced word $g$ in
$G'$. We show that if all the powers of $c$ and $e$ which occur in $g$ are divisible
by $p_{4i}$, then all the $n_k$ with nonzero $\varepsilon_k$ in the above product are divisible by
$p_{4i}$, i.e., $[n_k] \in x_i^{-1} H' x_i$.

In fact, let $\bar{g}$ = the image of $g$ in $|\{c, d, e\}|$ under the homomorphism
which takes $t$, $a_j$, and $b_j$ to 1. Since $\overline{[n]} = c^n d e^n$, it follows that all the $\overline{[n]}$
are free, and that $\bar{g}$ uniquely determines the sequence $\{\varepsilon_k n_k\}$. It is not hard
to see that the formulas which express $\varepsilon_k n_k$ in terms of the powers of $c$ and $e$
which occur in the reduced word $\bar{g}$ are linear with integer coefficients
(more precisely, they are a disjunction of linear formulas accompanied by
inequality conditions). Therefore, if all these powers are divisible by $p_{4i}$,
then so is $n_k$.

This completes the proof. □